In theory this shouldn't help, but just in case: the idea is to call gc() periodically to "force" garbage collection. This is the code I use to try to force it:


public static long gc() {
    // mem() and sleep() are helpers not shown here (presumably the used-heap
    // size and an interruption-safe Thread.sleep).
    long bef = mem();
    System.gc();
    sleep(100);
    System.runFinalization();
    sleep(100);
    System.gc();
    long aft = mem();
    return aft - bef;
}
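
For completeness, here is a self-contained sketch of how this might be wired up to run periodically. The mem() and sleep() helpers are my guesses at what the snippet above relies on (they are not shown in the original), and the once-a-minute schedule is just an example:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class GcForcer {

    // Guess at the mem() helper: currently used heap, in bytes.
    static long mem() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Guess at the sleep() helper: Thread.sleep that preserves the interrupt flag.
    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // The gc() method from above, repeated so this sketch compiles on its own.
    public static long gc() {
        long bef = mem();
        System.gc();
        sleep(100);
        System.runFinalization();
        sleep(100);
        System.gc();
        long aft = mem();
        return aft - bef;
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                System.out.println("gc() changed used heap by " + gc() + " bytes");
            }
        }, 1, 1, TimeUnit.MINUTES);
    }
}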

Mark Florence wrote:

Thanks, Jim. I'm pretty sure I'm throwing OOM for real,
and not because I've run out of file handles. I can easily
recreate the latter condition, and it is always reported
accurately. I've also monitored the OOM as it occurs using
"top" and I can see memory usage climbing until it is
exhausted -- if you will excuse the pun!

I'm not familiar with the new compound file format. Where
can I look to find more information?

-- Mark

-----Original Message-----
From: James Dunn [mailto:[EMAIL PROTECTED]
Sent: Friday, July 02, 2004 01:29 pm
To: Lucene Users List
Subject: Re: Running OutOfMemory while optimizing and searching


Ah yes, I don't think I made that clear enough. From Mark's original post, I believe he mentioned that he used separate readers for each simultaneous query.

His other issue was that he was getting an OOM during
an optimize, even when he set the JVM heap to 2GB. He
said his index was about 10.5GB spread over ~7000
files on Linux.


My guess is that the OOM might actually be a "too many
open files" error. I have seen that type of error
reported by the JVM as an OutOfMemory error on Linux
before. I had the same problem myself, but since
switching to the new Lucene compound file format I
haven't seen it.
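
In case it helps, this is roughly what switching looks like -- a minimal sketch assuming a Lucene 1.4-style IndexWriter, with a placeholder index path:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class CompoundFormatExample {
    public static void main(String[] args) throws Exception {
        // Open an existing index and enable the compound file format, so each
        // segment is packed into a single .cfs file instead of many small files.
        IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
        writer.setUseCompoundFile(true);
        writer.optimize();   // rewrites the segments with the new setting
        writer.close();
    }
}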


Mark, have you tried switching to the compound file
format?


Jim




--- Doug Cutting <[EMAIL PROTECTED]> wrote:

> What do your queries look like?  The memory required
> for a query can be computed by the following equation:
>
> 1 Byte * Number of fields in your query * Number of docs in your index
>
> So if your query searches on all 50 fields of your 3.5
> Million document index then each search would take
> about 175MB.  If your 3-4 searches run concurrently
> then that's about 525MB to 700MB chewed up at once.

That's not quite right. If you use the same IndexSearcher (or IndexReader)
for all of the searches, then only 175MB is used. The arrays in question
(the norms) are read-only and can be shared by all searches.
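
A rough sketch of that sharing, with a placeholder index path and field name (not code from this thread): one IndexSearcher is opened once and reused for every query, so the norm arrays are loaded a single time.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class SharedSearcherExample {
    // One searcher for the whole application; all threads share it.
    private static IndexSearcher searcher;

    public static void main(String[] args) throws Exception {
        searcher = new IndexSearcher("/path/to/index");

        // Every query, from any thread, goes through the same searcher.
        Query query = QueryParser.parse("some words", "contents", new StandardAnalyzer());
        Hits hits = searcher.search(query);
        System.out.println(hits.length() + " matching documents");
    }
}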


In general, the amount of memory required is:

1 byte * number of searchable fields in your index * number of docs in your index

plus

1 KB * number of terms in the query

plus

1 KB * number of phrase terms in the query

The latter two are for I/O buffers. There are a few
other things, but these are the major ones.
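
A quick back-of-the-envelope version of that formula, using the 50-field / 3.5-million-document numbers from the quoted message and a made-up term count:

public class MemoryEstimate {

    // 1 byte per searchable field per document (the norms), plus roughly
    // 1 KB per query term and per phrase term (I/O buffers).
    static long estimateBytes(int searchableFields, int numDocs, int queryTerms, int phraseTerms) {
        return (long) searchableFields * numDocs
                + 1024L * queryTerms
                + 1024L * phraseTerms;
    }

    public static void main(String[] args) {
        // 50 searchable fields, 3.5 million documents, a 10-term query, no phrases.
        long bytes = estimateBytes(50, 3500000, 10, 0);
        System.out.println("roughly " + (bytes / 1000000) + " MB per open reader");
    }
}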


Doug



