We are using Lucene 2.4.
Michael McCandless-2 wrote:
>
> Which version of Lucene are you using?
>
> More questions/answers below...
>
> [email protected] wrote:
>
>> We scan the web and index pages in Lucene. Our index size is in the
>> range of 500K to 1 million documents. As we index pages, we also call
>> IndexWriter.optimize after certain time intervals [I believe Lucene
>> also does optimization in the background?].
>
> Actually Lucene merges segments periodically in the background, but does
> not optimize.
>
>> So far it has worked great. But for just this one scan we noticed that
>> our index size grew to 90 GB for about 900K documents [typical index
>> size should be around 17-18 GB]. We are not sure what caused the index
>> to grow this large. Outside of our system, when we did a forced
>> IndexWriter.optimize() on this 90 GB Lucene index, it indeed shrank to
>> 17 GB. My question is what may have caused the size to grow to 90 GB?
>
> Optimize requires free temporary disk space equal to 1X the index size.
>
> Do you have an IndexReader open on the index when optimize runs? That
> ties up another 1X.
>
> That should mean a 17-18 GB index takes 51-54 GB, so I'm not sure why
> you got up to 90 GB. There were no exceptions, even in BG merge threads?
>
> Are you reopening readers while optimize is running? In theory that
> could tie up even more disk space (e.g. if you didn't close the old
> readers).
>
>> Did the size grow because optimization failed?
>
> If optimization fails it would remove the partially written files, so
> I don't think this would explain too-high disk usage.
>
>> Does optimization fail if there is any foreign file in the Lucene index
>> directory [though we tried optimizing with foreign files in the Lucene
>> directory, and Lucene still did optimize the index]?
>
> Foreign files are harmless as long as they don't conflict w/ Lucene's
> file names.
>
> Mike
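
For reference, the disk-space arithmetic in the quoted reply (1X for the index itself, 1X of temporary space for optimize, and another 1X if an IndexReader is open) can be written out as a small sketch. This is only an illustration of that reasoning: the class and method names are hypothetical, it does not call any Lucene API, and it assumes the multipliers simply add up as described above.

```java
// Rough sketch of the disk-space arithmetic from the quoted reply.
// OptimizeDiskEstimate and peakDiskGb are hypothetical names; nothing
// here calls Lucene.
class OptimizeDiskEstimate {

    /**
     * Estimates peak disk usage (in GB) while optimize runs.
     *
     * @param indexGb    size of the optimized index in GB
     * @param readerOpen true if an IndexReader is open during optimize
     */
    static double peakDiskGb(double indexGb, boolean readerOpen) {
        double multiplier = 1.0   // the index itself
                          + 1.0;  // temporary space needed by optimize
        if (readerOpen) {
            multiplier += 1.0;    // files held by the open reader
        }
        return indexGb * multiplier;
    }

    public static void main(String[] args) {
        // A 17-18 GB index with a reader open during optimize.
        System.out.println(peakDiskGb(17.0, true)); // prints 51.0
        System.out.println(peakDiskGb(18.0, true)); // prints 54.0
    }
}
```

Under these assumptions the estimate tops out at 51-54 GB, which is why 90 GB is not explained by a single optimize with one open reader.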
