Simone,

Are you using any field caches or filters?

In versions prior to 2.9, reopening the index completely rebuilds the field cache and filter bits for all documents in the index, which can result in an increase in memory consumption. In 2.9 and later versions, the field cache and filter bits are cached at the segment level, which results in significantly faster re-opens because only the new segments are loaded into the caches. Our applications use very large indexes, and 2.9's segment-level caching allows us to re-open indexes much faster while using less memory in the process.

Michael

-----Original Message-----
From: Simone Chiaretta [mailto:[email protected]]
Sent: Wednesday, January 06, 2010 10:01 AM
To: [email protected]
Subject: Re: Possible memory leak in Lucene.NET 2.4?

What I am doing is initializing the writer in the App_Start event of the web app, and closing everything at the App_End event. For the reader, I start it at the first search request, re-open it every time a new document is added, and then close it in App_End.

If you are interested, here is the search engine service I'm using:
http://code.google.com/p/subtext/source/browse/trunk/src/Subtext.Framework/Services/SearchEngine/SearchEngineService.cs

Simone

On Wed, Jan 6, 2010 at 6:31 PM, Matt Honeycutt <[email protected]> wrote:

> Won't the various global application events be fired if the app pool is
> gracefully terminated/recycled? While not ideal, couldn't you initialize
> your Lucene objects during one of the application initialization events,
> then dispose of them in the corresponding shutdown events?
>
> On Wed, Jan 6, 2010 at 11:14 AM, Michael Garski <[email protected]> wrote:
>
> > If it's not an option to create search functionality in a separate
> > process, such as in a shared hosting environment, you may be limited in
> > the size of your index and how you query it. The field cache, and to a
> > lesser extent filters, will consume a fair amount of memory that is
> > proportional to the number of documents in the index.
> >
> > As others have mentioned, you will have to ensure that resources are
> > released when the app pool recycles.
> >
> > Michael
> >
> > -----Original Message-----
> > From: Simone Chiaretta [mailto:[email protected]]
> > Sent: Wednesday, January 06, 2010 12:45 AM
> > To: [email protected]
> > Subject: Re: Possible memory leak in Lucene.NET 2.4?
> >
> > Unfortunately not everybody can use another process: I'm building a
> > blog engine that must be able to run on a shared hosting provider. The
> > 2nd process is not an option :)
> >
> > Simone
> >
> > On Tuesday, January 5, 2010, Digy <[email protected]> wrote:
> > > As Michael stated, I also prefer not hosting "indexing and searching
> > > services" in IIS.
> > > There are many alternatives such as WCF, Remoting etc. With a separate
> > > service for Lucene, you can control anything you want.
> > >
> > > DIGY
> > >
> > > -----Original Message-----
> > > From: Michael Garski [mailto:[email protected]]
> > > Sent: Tuesday, January 05, 2010 11:11 PM
> > > To: [email protected]
> > > Subject: RE: Possible memory leak in Lucene.NET 2.4?
> > >
> > > Jeff,
> > >
> > > Correct - there is no need to optimize the index after adding a
> > > document, and I would recommend against it especially when you move to
> > > 2.9 as you will not see any of the benefits of the changes to composite
> > > readers such as faster incremental warm-ups to filters and field caches.
> > >
> > > I've never run Lucene.Net in the context of a web process and would
> > > actually recommend against that approach due to app pool recycling,
> > > opting for a service that exposes search functionality via WCF.
> > >
> > > What types of queries are you executing? Are you using filters or
> > > sorting? How often do you re-open the IndexReader that is used for
> > > searching? Re-opening the reader after each document addition can be an
> > > expensive process, especially if you are using filters and/or sorts.
> > > How are you refreshing the IndexReader?
> > >
> > > Regarding the IndexReader locking files, this is a feature which allows
> > > you to concurrently index and search on the same index and not have to
> > > worry about the IndexWriter deleting a segment file from underneath the
> > > searcher when a segment merge occurs.
> > >
> > > The first step would be to use a memory profiler to determine
> > > what is actually consuming the memory. I use the SciTech .NET Memory
> > > Profiler for such purposes.
> > >
> > > Michael
> > >
> > > -----Original Message-----
> > > From: Jeff Pennal [mailto:[email protected]]
> > > Sent: Tuesday, January 05, 2010 12:42 PM
> > > To: [email protected]
> > > Subject: Possible memory leak in Lucene.NET 2.4?
> > >
> > > Hello all,
> > >
> > > In doing some profiling of our Lucene code, I noticed that we were
> > > calling optimize after every update to our index. Though our index is
> > > relatively small (~75MB), the optimize task still took way too much
> > > time to run.
> > >
> > > I did some research and it seems like it would not be an issue to
> > > update our index without optimizing afterwards, the side effect being
> > > that we'd have more open file handles.
> > >
> > > I made that change and noticed some horrible performance side effects.
> > >
> > > The first thing I noticed was that CPU usage for our web application
> > > (ASP.NET MVC), which reads from the index, never went below 60-70% and
> > > was frequently pegged at 99%.
> > >
> > > In addition to the CPU spiking, the memory taken up by the w3wp.exe
> > > process quickly grew to around 800MB, which is about 300MB above
> > > normal.
> > >
> > > This has all the hallmarks of a memory leak somewhere.
> > >
> > > Finally, I noticed that the IndexReader was locking some of the files
> > > in the index folder even though the reader was set to nolock mode.
> > > This seemed to be the cause of the increase in the number of files in
> > > the index folder.
> > >
> > > We have the IndexReader set to open once and then be shared among every
> > > request to the web application. My understanding is that this is the
> > > correct way to do this, and this never caused any issues when we were
> > > optimizing the index after every update.
> > >
> > > I know this is a pretty vague problem and there could be any number of
> > > issues involved here. However, if anyone could suggest areas to look
> > > into for possible solutions, it would be greatly appreciated.
> > >
> > > Thanks,
> > > Jeff
> > >
> >
> > --
> > Simone Chiaretta
> > Microsoft MVP ASP.NET - ASPInsider
> > Blog: http://codeclimber.net.nz
> > RSS: http://feeds2.feedburner.com/codeclimber
> > twitter: @simonech
> >
> > Any sufficiently advanced technology is indistinguishable from magic
> > "Life is short, play hard"
> >
>

--
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
Blog: http://codeclimber.net.nz
RSS: http://feeds2.feedburner.com/codeclimber
twitter: @simonech

Any sufficiently advanced technology is indistinguishable from magic
"Life is short, play hard"
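
To make the difference between index-level and segment-level caching discussed in this thread concrete, here is a minimal Python sketch. It is a toy model, not Lucene.NET code: Segment, WholeIndexCache, PerSegmentCache and simulate are hypothetical names invented for illustration, and the sketch only counts how many document values each caching strategy has to load across re-opens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for an immutable index segment."""
    name: str
    field_values: tuple  # one cached field value per document


class WholeIndexCache:
    """Toy model of pre-2.9 behaviour: each re-open reloads every document."""

    def __init__(self):
        self.values = []
        self.docs_loaded = 0  # total values read while filling the cache

    def reopen(self, segments):
        self.values = []
        for seg in segments:
            self.values.extend(seg.field_values)
            self.docs_loaded += len(seg.field_values)


class PerSegmentCache:
    """Toy model of 2.9 behaviour: entries are keyed by segment."""

    def __init__(self):
        self.by_segment = {}
        self.docs_loaded = 0

    def reopen(self, segments):
        current = {seg.name for seg in segments}
        # Drop entries for segments that no longer exist (e.g. merged away).
        for name in list(self.by_segment):
            if name not in current:
                del self.by_segment[name]
        # Load only the segments that have not been seen before.
        for seg in segments:
            if seg.name not in self.by_segment:
                self.by_segment[seg.name] = list(seg.field_values)
                self.docs_loaded += len(seg.field_values)


def simulate():
    """One 1000-document segment, then five single-document additions,
    re-opening both caches after every addition."""
    whole, per_seg = WholeIndexCache(), PerSegmentCache()
    segments = [Segment("_0", tuple(range(1000)))]
    whole.reopen(segments)
    per_seg.reopen(segments)
    for i in range(1, 6):
        segments = segments + [Segment(f"_{i}", (1000 + i,))]
        whole.reopen(segments)
        per_seg.reopen(segments)
    return whole.docs_loaded, per_seg.docs_loaded


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions the whole-index cache reads 6,015 values in total, while the per-segment cache reads 1,005. Optimizing after every update replaces all segments with a single new one, so in this model the per-segment cache would reload everything as well, which is consistent with the advice in the thread against optimizing after each addition.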
