Hi,

I did some further investigation with jvisualvm. You can use any version, even
the newest one with a different bitness, because it can always read the heap
dump. I recommend the Java 7 64-bit one: it has the most features and does not
itself run out of memory.

> When looking at the MBean mess, it looks like:
> The whole VM is filled with MBean statistics (20% of the total heap!!!), just
> for statistics. It looks like the MBean server is not shut down correctly when
> the Solr instance shuts down, so it sums up while running tests, every new
> Solr instance adds new statistics to the huge MBean maps eating all the heap
> (and possibly permgen, because most strings may be interned)! This is a
> huge leak, we should fix this (or disable the whole useless MBean shit
> completely, at least for tests). Was this strange, never-seen package
> com.yammer.metrics introduced recently related to mbeans - or is
> zookeeper the bad guy?

It's much worse: the String instances are only 20% of the heap, but another 26%
is used by the ConcurrentHashMap entry objects holding those references.
Together with tons of ConcurrentHashMaps and com.yammer.metrics.core instances,
that eats up 60% of the total heap space (counting only reachable objects, not
those waiting to be GCed).
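
To illustrate the kind of leak I suspect, here is a minimal sketch. It is not
Solr's or metrics-core's actual code: the domain name "sketch.stats", the
Counter bean and the helper methods are all made up for illustration. It only
shows that beans registered with an MBeanServer stay registered (and reachable)
until somebody explicitly unregisters them, and how a shutdown hook could clean
up everything under one domain.

```java
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

final class MBeanCleanupSketch {

    // Hypothetical JMX domain, chosen only for this sketch.
    static final String DOMAIN = "sketch.stats";

    // Standard MBean convention: interface name = implementation name + "MBean".
    public interface CounterMBean {
        long getCount();
    }

    public static final class Counter implements CounterMBean {
        @Override
        public long getCount() {
            return 0L;
        }
    }

    // Simulates one instance starting up and registering a statistics bean.
    static void register(MBeanServer server, int instanceId) {
        try {
            ObjectName name =
                new ObjectName(DOMAIN + ":type=Counter,instance=" + instanceId);
            server.registerMBean(new Counter(), name);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a snapshot of all names currently registered under DOMAIN.
    static Set<ObjectName> registeredNames(MBeanServer server) {
        try {
            return server.queryNames(new ObjectName(DOMAIN + ":*"), null);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // What a shutdown would have to do: unregister every bean under DOMAIN.
    static int unregisterAll(MBeanServer server) {
        int removed = 0;
        for (ObjectName name : registeredNames(server)) {
            try {
                server.unregisterMBean(name);
                removed++;
            } catch (JMException e) {
                throw new IllegalStateException(e);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // A private server keeps the sketch isolated from the platform server.
        MBeanServer server = MBeanServerFactory.newMBeanServer();
        for (int i = 0; i < 3; i++) {
            register(server, i);
        }
        System.out.println("registered: " + registeredNames(server).size());
        unregisterAll(server);
        System.out.println("after cleanup: " + registeredNames(server).size());
    }
}
```

If the unregisterAll step is skipped, every simulated instance leaves its bean
behind in the server's maps, which is the pattern the heap dump seems to show.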

The big question: do we need com.yammer.metrics.core (it is 
metrics-core-2.1.2.jar in solr/core/lib) at all? When was it introduced? Lucene 
3.6 does not have it, and neither does Solr 4.0. It must have been introduced 
recently, and it eats up all the memory.

Uwe

> > -----Original Message-----
> > From: Mark Miller [mailto:[email protected]]
> > Sent: Wednesday, December 26, 2012 3:22 AM
> > To: [email protected]
> > Subject: Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build # 3421 - Failure!
> >
> > Is this one a nightly build?
> >
> > I can run it and look at it closely tomorrow.
> >
> > - Mark
> >
> > On Dec 25, 2012, at 6:04 PM, Uwe Schindler <[email protected]> wrote:
> >
> > > > Can we add a try/finally block that catches permgen errors and calls
> > > > Runtime.halt (not exit)? I could add another extra allowance to the
> > > > security manager, disallowing exits.
> > > >
> > > > But we should try to find the issue in the tests, maybe Mark has an idea.
> > > > We have the heap dump readily available, but I don't have the tools to
> > > > inspect it.
> > >
> > > Uwe
> > >
> > >
> > >
> > > > Dawid Weiss <[email protected]> wrote:
> > > > the test framework crashes somehow and does not respond anymore.
> > >
> > > I think I know exactly how it crashes -- there's not much mystery
> > > about this: once the permgen is exhausted OOM errors are thrown from
> > > tests; what happens then is these errors are caught and an attempt
> > > is made to serialize these errors to the master node. Unfortunately
> > > this process involves loading some classes that are not yet loaded
> > > and, since the permgen is already exhausted, everything goes insane
> > > (the thread apparently just silently quits; there are finally blocks
> > > that are never reached).
> > >
> > > Like I said -- I'll see what I can do about it but I don't have any
> > > optimistic feelings. This is really riding a critical edge and short
> > > of preallocating static data structures I don't see any way of
> > > implementing a clean solution for the problem.
> > >
> > > Dawid
> > >
> > >
> > > To unsubscribe, e-mail: [email protected] For
> > > additional commands, e-mail: [email protected]
> > >
> > >
> > > --
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, 28213 Bremen
> > > http://www.thetaphi.de