Hello all,
While profiling our Lucene code, I noticed that we were calling optimize
after every update to our index. Though our index is relatively small
(~75MB), the optimize task still took way too much time to run.
I did some research and it seems like it would not be an issue to update
our index without optimizing afterwards, the side effect being that we'd
have more open file handles.
I made that change and noticed some horrible performance side effects.
The first thing I noticed was that the CPU usage of our web application
(ASP.NET MVC), which reads from the index, never went below 60-70% and
was frequently pegged at 99%.
In addition to the CPU spiking, the memory taken up by the w3wp.exe
process quickly grew to around 800MB, which is about 300MB above normal.
This has all the hallmarks of a memory leak somewhere.
Finally, I noticed that the IndexReader was locking some of the files in
the index folder even though the reader was set to nolock mode. This
seemed to be the cause of the increase in the number of files in the
index folder.
We have the IndexReader set to open once and then be shared among every
request to the web application. My understanding is that this is the
correct way to do this, and this never caused any issues when we were
optimizing the index after every update.
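
To make the open-once-and-share setup concrete, here is a minimal sketch
of the pattern. It is written in Java and is not our actual code:
SharedReaderHolder, Ref and swap are hypothetical names, and R stands in
for the reader type. It assumes the shared reader is replaced after an
index update and that the old reader is closed only once the last
in-flight request releases it, which may matter for the locked files
mentioned above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical holder for one shared reader; R stands in for an index reader type. */
final class SharedReaderHolder<R extends Closeable> {

    /** A reader plus a usage count. The holder itself owns one reference. */
    static final class Ref<R extends Closeable> {
        final R reader;
        private final AtomicInteger refs = new AtomicInteger(1);

        Ref(R reader) { this.reader = reader; }

        /** Returns false if the reader has already been closed. */
        boolean tryAcquire() {
            for (;;) {
                int n = refs.get();
                if (n == 0) return false;
                if (refs.compareAndSet(n, n + 1)) return true;
            }
        }

        /** Closes the underlying reader when the last user lets go. */
        void release() throws IOException {
            if (refs.decrementAndGet() == 0) reader.close();
        }
    }

    private volatile Ref<R> current;

    SharedReaderHolder(R initial) { current = new Ref<>(initial); }

    /** Each request calls acquire(), searches, then calls release() in a finally block. */
    Ref<R> acquire() {
        for (;;) {
            Ref<R> ref = current;
            if (ref.tryAcquire()) return ref;
            // Lost a race with swap(); retry against the new current reader.
        }
    }

    /** Called after an index update: publish the fresh reader, drop the holder's reference to the old one. */
    synchronized void swap(R fresh) throws IOException {
        Ref<R> old = current;
        current = new Ref<>(fresh);
        old.release();
    }
}
```

With something like this, a reader that is never swapped or released
would keep its old files open for as long as the application runs.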
I know this is a pretty vague problem and there could be any number of
issues involved here. However, if anyone could suggest areas to look
into for possible solutions, it would be greatly appreciated.
Thanks,
Jeff