Simone,

Are you using any field caches or filters?  

In versions prior to 2.9, reopening the index completely rebuilds
the field cache and filter bits for all documents in the index, which
can increase memory consumption.  In 2.9 and later versions, the field
cache and filter bits are cached at the segment level, which makes
re-opens significantly faster because only the new segments are loaded
into the caches.
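
To make the difference concrete, here is a toy model in Java (the language of the original Lucene, of which Lucene.NET is a port). It is only a sketch under stated assumptions: Segment, IndexLevelCache and SegmentLevelCache are hypothetical names invented for this illustration, not Lucene or Lucene.NET types, and the int arrays merely stand in for field cache entries. It counts how many documents' worth of cache data each strategy loads across two opens of the same index.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not Lucene or Lucene.NET types.
final class Segment {
    final String name;
    final int docCount;
    Segment(String name, int docCount) { this.name = name; this.docCount = docCount; }
}

// Pre-2.9 style: one cache entry for the whole index, rebuilt on every reopen.
final class IndexLevelCache {
    long docsLoaded = 0;
    int[] values = new int[0];

    void reopen(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) total += s.docCount;
        values = new int[total];   // every document is loaded again
        docsLoaded += total;
    }
}

// 2.9 style: one cache entry per segment; unchanged segments are reused.
final class SegmentLevelCache {
    long docsLoaded = 0;
    private Map<String, int[]> bySegment = new HashMap<>();

    void reopen(List<Segment> segments) {
        Map<String, int[]> next = new HashMap<>();
        for (Segment s : segments) {
            int[] cached = bySegment.get(s.name);
            if (cached == null) {          // only segments not seen before are loaded
                cached = new int[s.docCount];
                docsLoaded += s.docCount;
            }
            next.put(s.name, cached);
        }
        bySegment = next;                  // entries for merged-away segments are dropped
    }
}

final class SegmentCacheDemo {
    public static void main(String[] args) {
        List<Segment> before = Arrays.asList(new Segment("_0", 1000), new Segment("_1", 200));
        List<Segment> after = Arrays.asList(new Segment("_0", 1000), new Segment("_1", 200),
                                            new Segment("_2", 5));
        IndexLevelCache whole = new IndexLevelCache();
        SegmentLevelCache perSegment = new SegmentLevelCache();
        whole.reopen(before);
        whole.reopen(after);
        perSegment.reopen(before);
        perSegment.reopen(after);
        System.out.println("index-level loaded " + whole.docsLoaded
                + ", segment-level loaded " + perSegment.docsLoaded);
    }
}
```

With the numbers in main, the index-level model loads 2405 document entries over the two opens, while the segment-level model loads 1205, because only the 5-document segment is new on the second open. The same model hints at why optimizing after every update works against 2.9: a merge into one new segment leaves nothing to reuse.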

Our applications use very large indexes and 2.9's segment level caching
allows us to re-open indexes much faster while utilizing less memory in
the process.

Michael

-----Original Message-----
From: Simone Chiaretta [mailto:[email protected]] 
Sent: Wednesday, January 06, 2010 10:01 AM
To: [email protected]
Subject: Re: Possible memory leak in Lucene.NET 2.4?

What I am doing is initializing the writer in the App_Start event of the
web app, and closing everything at the App_End event.
For the reader, I start it at the first search request, re-open it
every time a new document is added, and then close it in App_End.
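
The reader lifecycle described above (open lazily on the first search, re-open after each added document, close at shutdown) can be sketched roughly as follows. This is a minimal illustration in Java, not the code from the linked service: ToyReader, FakeReader and ReaderHolder are hypothetical names, and the assumed reopen contract (return the same instance when nothing changed, otherwise a new one that the caller swaps in) is only modelled here, not taken from any Lucene.NET source.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an index reader; not a Lucene or Lucene.NET type.
interface ToyReader {
    ToyReader reopen();   // assumed contract: returns this if nothing changed, else a new reader
    void close();
}

// Fake reader used to exercise the holder; indexChanged simulates a new commit.
final class FakeReader implements ToyReader {
    final int generation;
    boolean closed = false;
    boolean indexChanged = false;
    FakeReader(int generation) { this.generation = generation; }
    public ToyReader reopen() { return indexChanged ? new FakeReader(generation + 1) : this; }
    public void close() { closed = true; }
}

// Holds the single shared reader for the whole application.
final class ReaderHolder {
    private final Supplier<ToyReader> opener;
    private ToyReader current;

    ReaderHolder(Supplier<ToyReader> opener) { this.opener = opener; }

    // Called on every search request; opens the reader on first use.
    synchronized ToyReader get() {
        if (current == null) current = opener.get();
        return current;
    }

    // Called after a document is added.
    synchronized void refresh() {
        if (current == null) return;       // nothing opened yet, nothing to refresh
        ToyReader next = current.reopen();
        if (next != current) {
            // Simplification: a real application must not close a reader
            // that in-flight searches are still using.
            current.close();
            current = next;
        }
    }

    // Called from the application shutdown event.
    synchronized void shutdown() {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

The point of the holder is that every old reader is closed exactly once when it is replaced; a reader that is re-opened but never closed keeps its caches and file handles alive.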

If you are interested here is the search engine service I'm using:
http://code.google.com/p/subtext/source/browse/trunk/src/Subtext.Framework/Services/SearchEngine/SearchEngineService.cs

Simone

On Wed, Jan 6, 2010 at 6:31 PM, Matt Honeycutt <[email protected]> wrote:

> Won't the various global application events be fired if the app pool is
> gracefully terminated/recycled?  While not ideal, couldn't you initialize
> your Lucene objects during one of the application initialization events,
> then dispose of them in the corresponding shutdown events?
>
> On Wed, Jan 6, 2010 at 11:14 AM, Michael Garski <[email protected]> wrote:
>
> > If it's not an option to create search functionality in a separate
> > process, such as in a shared hosting environment, you may be limited in
> > the size of your index and how you query it.  The field cache, and to a
> > lesser extent filters, will consume a fair amount of memory that is
> > proportional to the number of documents in the index.
> >
> > As others have mentioned, you will have to ensure that resources are
> > released when the app pool recycles.
> >
> > Michael
> >
> > -----Original Message-----
> > From: Simone Chiaretta [mailto:[email protected]]
> > Sent: Wednesday, January 06, 2010 12:45 AM
> > To: [email protected]
> > Subject: Re: Possible memory leak in Lucene.NET 2.4?
> >
> > Unfortunately not everybody can use another process: I'm building a
> > blog engine that must be able to run on shared hosting provider. The
> > 2nd process is not an option :)
> >
> > Simone
> >
> > On Tuesday, January 5, 2010, Digy <[email protected]> wrote:
> > > As Michael stated, I also prefer not to host "indexing and searching
> > > services" in IIS.
> > > There are many alternatives such as WCF, Remoting etc. With a separate
> > > service for Lucene, you can control anything you want.
> > >
> > > DIGY
> > >
> > > -----Original Message-----
> > > From: Michael Garski [mailto:[email protected]]
> > > Sent: Tuesday, January 05, 2010 11:11 PM
> > > To: [email protected]
> > > Subject: RE: Possible memory leak in Lucene.NET 2.4?
> > >
> > > Jeff,
> > >
> > > Correct - there is no need to optimize the index after adding a
> > > document, and I would recommend against it especially when you move to
> > > 2.9 as you will not see any of the benefits of the changes to composite
> > > readers such as faster incremental warm-ups to filters and field caches.
> > >
> > > I've never run Lucene.Net in the context of a web process and would
> > > actually recommend against that approach due to app pool recycling,
> > > opting instead for a service that exposes search functionality via WCF.
> > >
> > > What types of queries are you executing? Are you using filters or
> > > sorting?  How often do you re-open the IndexReader that is used for
> > > searching?  Re-opening the reader after each document addition can be
> > > an expensive process, especially if you are using filters and/or sorts.
> > > How are you refreshing the IndexReader?
> > >
> > > Regarding the IndexReader locking files, this is a feature which allows
> > > you to concurrently index and search on the same index and not have to
> > > worry about the IndexWriter deleting a segment file from underneath the
> > > searcher when a segment merge occurs.
> > >
> > > The first place to look would be to use a memory profiler to determine
> > > what is actually consuming the memory.  I use the SciTech .NET Memory
> > > Profiler for such purposes.
> > >
> > > Michael
> > >
> > > -----Original Message-----
> > > From: Jeff Pennal [mailto:[email protected]]
> > > Sent: Tuesday, January 05, 2010 12:42 PM
> > > To: [email protected]
> > > Subject: Possible memory leak in Lucene.NET 2.4?
> > >
> > > Hello all,
> > >
> > > In doing some profiling of our Lucene code, I noticed that we were
> > > doing an optimize after every update to our index. Though our index is
> > > relatively small (~75MB), the optimize task still took way too much
> > > time to run.
> > >
> > > I did some research and it seems like it would not be an issue to
> > > update our index without optimizing afterwards, the side effect being
> > > that we'd have more open file handles.
> > >
> > > I made that change and noticed some horrible performance side effects.
> > >
> > > The first thing I noticed was that the CPU for our web application
> > > (ASP.NET MVC) that read from the Index never went below 60-70% and was
> > > frequently pegged at 99%.
> > >
> > > In addition to the CPU spiking, the memory taken up by the w3wp.exe
> > > process quickly grew to around 800MB, which is about 300MB above
> > > normal.
> > >
> > > This has all the hallmarks of a memory leak somewhere.
> > >
> > > Finally, I noticed that the IndexReader was locking some of the files
> > > in the index folder even though the reader was set to nolock mode.
> > > This seemed to be the cause of the increase in the number of files in
> > > the index folder.
> > >
> > > We have the IndexReader set to open once and then be shared among every
> > > request to the web application. My understanding is that this is the
> > > correct way to do this, and this never caused any issues when we were
> > > optimizing the index after every update.
> > >
> > > I know this is a pretty vague problem and there could be any number of
> > > issues involved here. However, if anyone could suggest areas to look
> > > into for possible solutions, it would be greatly appreciated.
> > >
> > > Thanks,
> > > Jeff
> > >
> > >
> > >
> >
> > --
> > Simone Chiaretta
> > Microsoft MVP ASP.NET - ASPInsider
> > Blog: http://codeclimber.net.nz
> > RSS: http://feeds2.feedburner.com/codeclimber
> > twitter: @simonech
> >
> > Any sufficiently advanced technology is indistinguishable from magic
> > "Life is short, play hard"
> >
> >
> >
>



-- 
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
Blog: http://codeclimber.net.nz
RSS: http://feeds2.feedburner.com/codeclimber
twitter: @simonech

Any sufficiently advanced technology is indistinguishable from magic
"Life is short, play hard"
