Mark,

What do your queries look like? The memory required for a query can be computed by the following equation:
    1 byte * number of fields in your query * number of docs in your index

So if your query searches on all 50 fields of your 3.5 million document index, then each search would take about 175MB. If your 3-4 searches run concurrently, then that's about 525MB to 700MB chewed up at once. Also, if your queries use wildcards, the memory requirements could be much greater.

Hope that helps,
Jim

--- Mark Florence <[EMAIL PROTECTED]> wrote:

> Otis, Thanks for considering this problem.
>
> I'm using all the default parameters -- and still scratching my head!
>
> I can't see any that would control memory usage. Plus, a 2GB heap is
> quite big. I see others have indexes bigger than mine, so I'm not sure
> how to find out why mine is throwing OutOfMemory -- not only on the
> optimize, but when 3-4 searchers are running, too.
>
> -- Mark
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 29, 2004 01:02 am
> To: Lucene Users List
> Subject: Re: Running OutOfMemory while optimizing and searching
>
> Mark,
>
> Tough situation. I hate when things like this happen on production :(.
> You are not mentioning what you are using for various IndexWriter
> parameters. You may be able to get this working by tweaking them (see
> http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#field_summary).
> Hm, now that I think about it, I am not sure if those are considered
> during index optimization. I'll try checking the sources later.
>
> Otis
>
> --- Mark Florence <[EMAIL PROTECTED]> wrote:
> > Hi, I'm using Lucene to index ~3.5M documents, over about 50 fields.
> > The Lucene index itself is ~10.5GB, spread over ~7,000 files. Some of
> > these files are "large" -- that is, several PRX files are ~1.5GB.
> >
> > Lucene runs on a dedicated server (Linux on a 1Ghz Dell, with 1GB RAM).
> > Clients on other machines use RMI to perform reads / writes. Each night
> > the server automatically performs an optimize.
> >
> > The problem is that the optimize now dies with an OutOfMemory exception,
> > even when the JVM heap size is set to its maximum of 2GB. I need to
> > optimize, because as the number of Lucene files grows, search performance
> > becomes unacceptable.
> >
> > Search performance is also adversely affected because I've had to
> > effectively single-thread reads and writes. I was using a simple
> > read / write lock mechanism, allowing multiple readers to simultaneously
> > search, but now more than 3-4 simultaneous readers will also cause an
> > OutOfMemory condition. Searches can take as long as 30-40 seconds, and
> > with single-threading, that's crippling the main client application.
> >
> > Needless to say, the Lucene index is mission-critical, and must run 24/7.
> >
> > I've seen other posts along this same vein, but no definite consensus.
> > Is my problem simply inadequate hardware? Should I run on a 64-bit
> > platform, where I can allocate a Java heap of > 2GB?
> >
> > Or could there be something fundamentally "wrong" with my index? I should
> > add that I've just spent about a week (!!) rebuilding from scratch, over
> > all 3.5M documents.
> >
> > -- Many thanks for any help!
> > Mark Florence
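
For anyone who wants to plug in their own numbers, here is Jim's rule of thumb as a quick Java sketch. The class is purely illustrative (it is not a Lucene API), and the 1-byte figure appears to correspond to the field norms that Lucene loads for each searched field, one byte per document:

    // Purely illustrative helper -- not part of Lucene.
    public class QueryMemoryEstimate {

        /** Jim's estimate: 1 byte * fields searched * docs in the index. */
        static long estimateBytes(int fieldsInQuery, long docsInIndex) {
            return 1L * fieldsInQuery * docsInIndex;
        }

        public static void main(String[] args) {
            long perSearch = estimateBytes(50, 3500000L);
            System.out.println("Per search:   " + perSearch / 1000000 + " MB");      // ~175 MB
            System.out.println("4 concurrent: " + 4 * perSearch / 1000000 + " MB");  // ~700 MB
        }
    }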
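And for the IndexWriter parameters Otis points at, a rough sketch of how they would be tweaked against the Lucene 1.4-era API in the javadoc he links, where they are public fields. The index path and values below are placeholders, not recommendations from this thread:

    import java.io.IOException;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class TuneWriter {
        public static void main(String[] args) throws IOException {
            // "/path/to/index" is a placeholder; the values shown are the defaults.
            IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
            writer.mergeFactor = 10;                  // how many segments get merged together at a time
            writer.minMergeDocs = 10;                 // docs buffered in memory before a new segment is flushed
            writer.maxMergeDocs = Integer.MAX_VALUE;  // upper bound on docs in a single segment
            try {
                // ... addDocument() calls, then:
                writer.optimize();
            } finally {
                writer.close();
            }
        }
    }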
