Are you calling IndexWriter.commit when you shut down the app?

Mike McCandless
http://blog.mikemccandless.com

On Tue, Aug 25, 2015 at 11:49 PM, Loamy Hound <loamy.ho...@gmail.com> wrote:

> *Summary:*
>
> Lucene indexes appear to revert to some past state after an application
> restart.
>
> *Background:*
>
> We're running an enterprise application written in Java/Spring/Hibernate,
> deployed within Jetty, with a Postgres backend. See below for version info.
>
> We use Lucene to index certain components of the database to enable
> fast/complex searching.
>
> The indexes are built by querying the relevant database tables,
> transferring the data to Lucene documents and writing to disk.
>
> An IndexWriter is used to add and commit the documents. A commit is
> performed at the end of a batch of database reads (generally 5,000). The
> reading and writing of batches is multi-threaded.
>
> The writer is configured with the following TieredMergePolicy attributes:
>
> segmentsPerTier=50.0
> maxMergeAtOnce=5
> maxMergedSegmentMB=100.0
>
>
> No merge scheduler is set. The writer has its RAMBufferSizeMB set to 48.
>
> There are 23 separate indexes used to represent different logical
> components of the database.
>
> The largest index on disk is 13.7G.
>
> The largest index by number of documents contains around 32 million
> documents.
>
> Once the indexes are built they are maintained dynamically by the
> application to reflect the current state of the database. Dynamic updates
> are performed by a TrackingIndexWriter.
>
> *Problem:*
>
> After a reindex is run (as described above, a destructive process) the
> application runs okay and all Lucene queries return expected values that
> reflect the current state of the database.
>
> Subsequent usage of the system maintains the indexes in the correct state
> as evidenced by search results.
>
> In the last month we have found that after a restart of the application the
> indexes appear to revert to some unknown past state.
> The indexes can be
> queried okay (they're not corrupt, there are no logged errors or stack
> traces) but the data is either out of date (reflecting a past state of the
> database entries they represent) or missing.
>
> We first assumed the "past state" was based on the last reindex time, but
> have subsequently found that restarting the application immediately
> following a reindex still puts the indexes in a state that pre-dates the
> time of the last reindex.
>
> This is only occurring on a single site (our largest production site), and
> has only started in recent months. We have yet to reproduce the problem
> using an identical process with an identical configuration on
> near-identical data.
>
> We are not sure if the problem affects all of the indexes but know the
> larger (and most important) indexes are affected.
>
> *Question:*
>
> We are inclined to think that the problem is somewhere in our code, but are
> wondering if any of the described symptoms have been seen before by the
> Lucene community. Suggestions on how to isolate the problem, or
> configuration changes that may help are also most welcome.
>
> *Version Info:*
>
> Lucene:
>
> lucene-analyzers-common-4.9.1.jar
> lucene-core-4.9.1.jar
> lucene-grouping-4.9.1.jar
> lucene-join-4.9.1.jar
> lucene-misc-4.9.1.jar
> lucene-queries-4.9.1.jar
> lucene-queryparser-4.9.1.jar
> lucene-sandbox-4.9.1.jar
> lucene-snowball-2.4.1.jar
> lucene-suggest-4.9.1.jar
>
> Postgres:
>
> server: PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
> 4.4.7 20120313 (Red Hat 4.4.7-4), 64-bit
> client access: postgresql-9.1-901.jdbc4.jar
>
> OS:
>
> LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch
> Red Hat Enterprise Linux Server release 6.5 (Santiago)
>
> Java:
>
> java version "1.8.0_45"
> Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
>
> Jetty:
>
> jetty-6.1.22.jar
>
> Hibernate:
>
> hibernate-commons-annotations-4.0.2.Final.jar
> hibernate-core-4.2.2.Final.jar
> hibernate-ehcache-4.2.2.Final.jar
> hibernate-jpa-2.0-api-1.0.1.Final.jar
>
> Spring:
>
> spring-aop-4.0.4.RELEASE.jar
> spring-aspects-4.0.4.RELEASE.jar
> spring-beans-4.0.4.RELEASE.jar
> spring-context-4.0.4.RELEASE.jar
> spring-context-support-4.0.4.RELEASE.jar
> spring-core-4.0.4.RELEASE.jar
> spring-expression-4.0.4.RELEASE.jar
> spring-instrument-4.0.4.RELEASE.jar
> spring-jdbc-4.0.4.RELEASE.jar
> spring-jms-4.0.4.RELEASE.jar
> spring-orm-4.0.4.RELEASE.jar
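
On the shutdown question at the top of this thread: with 23 separate indexes, one way to make sure no writer is left with uncommitted changes is to route every writer through a single shutdown step. What follows is only a sketch under assumptions, not the poster's code. CommittableIndex and IndexShutdown are hypothetical names, and the interface merely stands in for the commit() and close() methods that Lucene's IndexWriter provides, so that the example runs without Lucene on the classpath.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the two IndexWriter methods this sketch relies on.
// A real adapter would delegate commit() and close() to a Lucene IndexWriter.
interface CommittableIndex extends Closeable {
    void commit() throws IOException;
}

// Hypothetical helper: tracks every open writer so that all of them can be
// committed and closed in one place when the application stops.
final class IndexShutdown {
    private final List<CommittableIndex> writers = new CopyOnWriteArrayList<>();

    void register(CommittableIndex writer) {
        writers.add(writer);
    }

    // Commits and then closes every registered writer. A failure on one index
    // is recorded and the loop continues, so the remaining indexes still get
    // committed. The collected failures are returned for logging.
    List<IOException> commitAndCloseAll() {
        List<IOException> failures = new ArrayList<>();
        for (CommittableIndex writer : writers) {
            try {
                writer.commit();
            } catch (IOException e) {
                failures.add(e);
            } finally {
                try {
                    writer.close();
                } catch (IOException e) {
                    failures.add(e);
                }
            }
        }
        writers.clear(); // a second call becomes a no-op
        return failures;
    }

    // Optional safety net: run the same step when the JVM exits normally
    // or on SIGTERM. It does not run if the process is killed outright.
    void installJvmHook() {
        Runtime.getRuntime().addShutdownHook(
                new Thread(this::commitAndCloseAll, "index-shutdown"));
    }
}
```

In a Spring application the same method could be called from a bean's destroy callback instead of (or in addition to) the JVM hook. Logging the returned failures at shutdown would also show whether a commit is silently failing on the affected site.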