Mark,

What do your queries look like? The memory required for a query can be computed by the following equation:
    1 byte * number of fields in your query * number of docs in your index

So if your query searches on all 50 fields of your 3.5 million document index, then each search would take about 175MB. If your 3-4 searches run concurrently, then that's about 525MB to 700MB chewed up at once. Also, if your queries use wildcards, the memory requirements could be much greater.

Hope that helps,
Jim

--- Mark Florence <[EMAIL PROTECTED]> wrote:

> Otis, Thanks for considering this problem.
>
> I'm using all the default parameters -- and still scratching my head!
>
> I can't see any that would control memory usage. Plus, a 2GB heap is
> quite big. I see others have indexes bigger than mine, so I'm not sure
> how to find out why mine is throwing OutOfMemory -- not only on the
> optimize, but when 3-4 searchers are running, too.
>
> -- Mark
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 29, 2004 01:02 am
> To: Lucene Users List
> Subject: Re: Running OutOfMemory while optimizing and searching
>
> Mark,
>
> Tough situation. I hate when things like this happen on production :(.
> You are not mentioning what you are using for various IndexWriter
> parameters. You may be able to get this working by tweaking them (see
> http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#field_summary).
> Hm, now that I think about it, I am not sure if those are considered
> during index optimization. I'll try checking the sources later.
>
> Otis
>
> --- Mark Florence <[EMAIL PROTECTED]> wrote:
> > Hi, I'm using Lucene to index ~3.5M documents, over about 50 fields.
> > The Lucene index itself is ~10.5GB, spread over ~7,000 files. Some of
> > these files are "large" -- that is, several PRX files are ~1.5GB.
> >
> > Lucene runs on a dedicated server (Linux on a 1Ghz Dell, with 1GB RAM).
> > Clients on other machines use RMI to perform reads / writes. Each night
> > the server automatically performs an optimize.
> >
> > The problem is that the optimize now dies with an OutOfMemory exception,
> > even when the JVM heap size is set to its maximum of 2GB. I need to
> > optimize, because as the number of Lucene files grows, search performance
> > becomes unacceptable.
> >
> > Search performance is also adversely affected because I've had to
> > effectively single-thread reads and writes. I was using a simple
> > read / write lock mechanism, allowing multiple readers to simultaneously
> > search, but now more than 3-4 simultaneous readers will also cause an
> > OutOfMemory condition. Searches can take as long as 30-40 seconds, and
> > with single-threading, that's crippling the main client application.
> >
> > Needless to say, the Lucene index is mission-critical, and must run 24/7.
> >
> > I've seen other posts along this same vein, but no definite consensus.
> > Is my problem simply inadequate hardware? Should I run on a 64-bit
> > platform, where I can allocate a Java heap of > 2GB?
> >
> > Or could there be something fundamentally "wrong" with my index? I should
> > add that I've just spent about a week (!!) rebuilding from scratch, over
> > all 3.5M documents.
> >
> > -- Many thanks for any help!
> > Mark Florence
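
For anyone who wants to plug in their own numbers, here is Jim's rule of thumb as a quick Java sketch. The class is purely illustrative (it is not a Lucene API), and the 1-byte figure appears to correspond to the field norms that Lucene loads for each searched field, one byte per document:

    // Purely illustrative helper -- not part of Lucene.
    public class QueryMemoryEstimate {

        /** Jim's estimate: 1 byte * fields searched * docs in the index. */
        static long estimateBytes(int fieldsInQuery, long docsInIndex) {
            return 1L * fieldsInQuery * docsInIndex;
        }

        public static void main(String[] args) {
            long perSearch = estimateBytes(50, 3500000L);
            System.out.println("Per search:   " + perSearch / 1000000 + " MB");      // ~175 MB
            System.out.println("4 concurrent: " + 4 * perSearch / 1000000 + " MB");  // ~700 MB
        }
    }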
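And for the IndexWriter parameters Otis points at, a rough sketch of how they would be tweaked against the Lucene 1.4-era API in the javadoc he links, where they are public fields. The index path and values below are placeholders, not recommendations from this thread:

    import java.io.IOException;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class TuneWriter {
        public static void main(String[] args) throws IOException {
            // "/path/to/index" is a placeholder; the values shown are the defaults.
            IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
            writer.mergeFactor = 10;                  // how many segments get merged together at a time
            writer.minMergeDocs = 10;                 // docs buffered in memory before a new segment is flushed
            writer.maxMergeDocs = Integer.MAX_VALUE;  // upper bound on docs in a single segment
            try {
                // ... addDocument() calls, then:
                writer.optimize();
            } finally {
                writer.close();
            }
        }
    }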
