>> But I'm afraid more detailed information will be needed before this OOM can be solved.
I can't see a likely explanation for my particular issue in CachingWrapperFilter but, as a separate issue, it looks to have this potential inefficiency (parallel loading/attempted caching of the same data).

Incidentally, as part of this exercise I did create a "BigFakeIndexReader" to simulate massive indexes using no disk or RAM. You can initialise it with an int representing a desired number of documents and it fakes all the access to termDocs/termEnums etc. given a small set of example field values. Calling termDocs.next() introduces delays to simulate the cost of hitting disk - this has been useful for debugging the caching here without access to the live dataset. Worth contributing? (A rough sketch of the termDocs simulation appears at the end of this message.)

Cheers
Mark

----- Original Message ----
From: Paul Elschot <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Monday, 18 February, 2008 4:38:10 PM
Subject: Re: Out of memory - CachingWrappperFilter and multiple threads

On Monday 18 February 2008 17:08:50, mark harwood wrote:
> >> Even though you only have two threads... how many different IndexReader
> >> instances do you have?
>
> One reader - kept open throughout.
>
> >> I think the only way to make the current version
> >> work correctly is to make the whole method synchronized on this
> >> object, i.e. declare it as public synchronized, and then
> >> the synchronized(cache) occurrences can be removed.
>
> Analysing the pros and cons of such a change...
> 1) Synchronisation costs for pre-cached content on the same reader
> (hopefully the most common case) would be just as quick as it is now, i.e.
> swapping a synchronised(this) for a synchronised(cache).
> 2) Synchronisation for uncached content would avoid reading multiple
> copies of the same bitset (one would expect overall faster responses in
> this scenario).
> 3) However, there would be an additional synchronisation cost for threads
> using unrelated readers - a thread using an old reader with pre-cached
> content would be delayed (potentially significantly) by a thread caching
> new content on a new reader.
>
> This last one sounds a bit nasty?

Yes, that could be problematic. But I'm afraid more detailed
information will be needed before this OOM can be solved.

Regards,
Paul Elschot

>
> Cheers
> Mark
>
> ----- Original Message ----
> From: Paul Elschot <[EMAIL PROTECTED]>
> To: java-dev@lucene.apache.org
> Sent: Monday, 18 February, 2008 3:23:01 PM
> Subject: Re: Out of memory - CachingWrappperFilter and multiple threads
>
> I think this code has been discussed before, but I can't find
> the occasion on which that happened.
>
> Actually, this code may do some things like creating two different
> caches and, as you said, calling filter.bits(reader) in parallel.
>
> Creating different caches does not hurt because in the end
> only one will survive the assignment, but it's not nice, as
> different threads might synchronize on different caches
> for the first cache.get() call, and I don't know whether
> Java guarantees that the cache.put() call will synchronize
> on the same cache as the cache in the cache.get()
> call in this case.
>
> Doing the filter.bits(reader) call without synchronization
> is probably on purpose: it's better not to hold any visible
> lock while calling outside an object.
>
> As it stands, I think the only way to make the current version
> work correctly is to make the whole method synchronized on this
> object, i.e. declare it as public synchronized, and then
> the synchronized(cache) occurrences can be removed.
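A minimal sketch of that first suggestion, assuming CachingWrapperFilter's
existing fields from Lucene 2.x (the wrapped filter and the cache map); a
sketch of the idea, not a tested patch:

// Sketch only: one lock per filter instance, so a thread filling the cache
// for a new reader also blocks threads whose bits for other readers are
// already cached - the cost Mark analyses above.
public synchronized BitSet bits(IndexReader reader) throws IOException {
  if (cache == null) {
    cache = new WeakHashMap();
  }
  BitSet cached = (BitSet) cache.get(reader); // check cache
  if (cached != null) {
    return cached;
  }
  BitSet bits = filter.bits(reader); // now computed at most once per reader
  cache.put(reader, bits);           // update cache
  return bits;
}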
> It might be better to initialize the cache in the constructor,
> and then synchronize on the cache even while
> calling filter.bits(reader). This is safe when the cache is private.
>
> Regards,
> Paul Elschot
>
>
> On Monday 18 February 2008 13:50:16, mark harwood wrote:
> > I'm chasing down a bug in my application where multiple threads were
> > reading and caching the same filter (same very common term, big index)
> > and caused an Out of Memory exception when I would expect there to be
> > plenty of memory to spare.
> > There's a number of layers to this app to investigate (I was using the
> > XMLQueryParser and the CachedFilter tag too) but CachingWrapperFilter
> > underpins all this stuff and I was led to this code in it...
> >
> > public BitSet bits(IndexReader reader) throws IOException {
> >   if (cache == null) {
> >     cache = new WeakHashMap();
> >   }
> >
> >   synchronized (cache) { // check cache
> >     BitSet cached = (BitSet) cache.get(reader);
> >     if (cached != null) {
> >       return cached;
> >     }
> >   }
> >
> >   final BitSet bits = filter.bits(reader);
> >
> >   synchronized (cache) { // update cache
> >     cache.put(reader, bits);
> >   }
> >
> >   return bits;
> > }
> >
> > The first observation is - why the use of "final" for the variable
> > "bits"? Would there be any side-effects to this?
> >
> > Perhaps more worryingly, I can see that multiple threads asking for the
> > same bitset simultaneously are likely to unnecessarily read the same
> > data from the same reader (but ultimately only one bitset should end up
> > cached). My app only had 2 simultaneous threads on the same reader so I
> > don't see how that accounts for the large memory bloat I saw. In a
> > high-traffic environment, though, I can see multiple requests for a
> > popular term getting bottle-necked here creating the same bitset and
> > causing an OOM error. It looks like this multiple-load scenario
> > could/should be avoided with some careful synchronisation.
> >
> > Unfortunately I've been unable to reproduce my OOM problem outside of
> > the live environment so can't fully pinpoint my particular issue or the
> > solution just yet.
> >
> > Thoughts?
> > Mark
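Mark's BigFakeIndexReader itself isn't posted in this thread. As a rough
illustration of the termDocs simulation he describes above, a minimal sketch
against the Lucene 2.x TermDocs interface might look like the following (the
class name, field names and delay mechanism are assumptions for illustration,
not his actual code):

import java.io.IOException;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;

// Hypothetical sketch: enumerates numDocs synthetic doc ids for any term,
// sleeping on each next() call to simulate the cost of hitting disk.
class FakeTermDocs implements TermDocs {
  private final int numDocs;      // simulated index size
  private final long delayMillis; // simulated per-document disk cost
  private int doc = -1;

  FakeTermDocs(int numDocs, long delayMillis) {
    this.numDocs = numDocs;
    this.delayMillis = delayMillis;
  }

  public void seek(Term term) { doc = -1; }         // every term "matches"
  public void seek(TermEnum termEnum) { doc = -1; }

  public int doc() { return doc; }
  public int freq() { return 1; }

  public boolean next() throws IOException {
    try {
      Thread.sleep(delayMillis); // pretend this hit disk
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return ++doc < numDocs;
  }

  public int read(int[] docs, int[] freqs) throws IOException {
    int count = 0;
    while (count < docs.length && next()) {
      docs[count] = doc;
      freqs[count] = 1;
      count++;
    }
    return count;
  }

  public boolean skipTo(int target) throws IOException {
    doc = target - 1; // land just before target, then advance once
    return next();
  }

  public void close() {}
}

A BigFakeIndexReader would then hand instances of this out from its
termDocs() methods, with maxDoc()/numDocs() returning the configured
document count, so CachingWrapperFilter can be exercised against an
arbitrarily "large" index without any disk or RAM cost.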