Hi devs,

Yesterday xwiki.org crashed and I had configured it to take a heap dump. I’ve 
done a quick analysis that I’m sharing here (I’ll continue to analyse):

Memory retained: 1GB

Main contenders:

1) Document cache: 178MB
2) Lucene WeightedSpanTermExtractor: 166MB
3) IRCBot Threads: 165MB
4) Velocity RuntimeInstance: 38MB
5) SOLR LRUCache (Lucene Document): 38MB
6) EM DefaultCoreExtensionRepository: 38MB
7) NamespaceURLClassLoader: 23MB

I’ve started analyzing some of them below.

1 - Document Cache Analysis
=======================

* There are 3552 XWikiDocument in memory for 195MB
* The document cache size is 2000 on xwiki.org
* Large documents (such as Test Reports) take 6MB each (XDOM caching)
* So if the wiki contained only large documents, the cache would need 2000 * 
6MB = 12GB
* I don’t think this cache is memory aware, meaning it doesn’t free its entries 
when memory is low
* 178MB for 2000 docs means an average of ~89KB per document, with huge 
variation between docs with big content and docs with little or no content

This means that when memory is low on xwiki.org, requesting a few pages with 
large content should be enough to trigger an OOM.

4 ideas to explore:

Idea 1: Use a cache that evicts entries when some max threshold is reached
** Infinispan doesn’t support this yet: 
https://issues.jboss.org/browse/ISPN-863 and 
https://community.jboss.org/thread/165951?start=0&tstart=0
** Guava seems to support size-based eviction with the ability to pass a weight 
function: 
http://code.google.com/p/guava-libraries/wiki/CachesExplained#Size-based_Eviction
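
To illustrate the eviction policy Guava offers (maximumWeight() + a weigher), 
here is a minimal JDK-only sketch — not Guava's actual implementation, just 
the idea of evicting least-recently-used entries once a total weight threshold 
is exceeded:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Sketch of weight-based LRU eviction (what Guava's maximumWeight()/weigher()
// provide out of the box). Real code would use Guava or Infinispan directly.
class WeightedLruCache<K, V> {
    private final long maxWeight;
    private final ToLongFunction<V> weigher;
    private long totalWeight;
    // accessOrder = true makes iteration order LRU (eldest entry first)
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>(16, 0.75f, true);

    WeightedLruCache(long maxWeight, ToLongFunction<V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    synchronized void put(K key, V value) {
        V old = map.put(key, value);
        if (old != null) {
            totalWeight -= weigher.applyAsLong(old);
        }
        totalWeight += weigher.applyAsLong(value);
        // Evict eldest entries until back under the weight threshold.
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (totalWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsLong(eldest.getValue());
            it.remove();
        }
    }

    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized int size() {
        return map.size();
    }
}
```

With XWikiDocument as the value type, the weigher could approximate a 
document's retained size (e.g. from its content length), so that one 6MB test 
report weighs as much as dozens of small documents.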

Idea 2: Use a distributed cache such as memcached or elasticsearch. I wonder 
whether the network-communication overhead would be too high to make it 
worthwhile compared to not caching the XDOM and re-rendering it every time 
it’s needed.

Idea 3: Try to reduce even more how the XDOM is stored in memory

Idea 4: Don’t cache the XDOM; render it every time, and use a dedicated cache 
for titles. Do the same for getting sections. I think these are the 2 main use 
cases for getting the XDOM.

As a short term action, I’d recommend immediately reducing the document cache 
size from 2000 to 1000 on xwiki.org, or doubling the heap memory.
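
If I remember correctly, this is controlled by the xwiki.store.cache.capacity 
property in xwiki.cfg (name from memory, please double-check), so the 
short-term change would look like:

```
# xwiki.cfg -- document cache capacity, reduced from 2000 to 1000
xwiki.store.cache.capacity=1000
```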

2 - Lucene WeightedSpanTermExtractor Analysis
=====================================

I’m not sure what this is about yet but it looks strange.

* There is 166MB stored in the Map<String,AtomicReaderContext> of 
WeightedSpanTermExtractor.
* That map contains 192 entries
* Example of map items: “doccontent_pt” (2.4MB), “title_ru” (1.8MB), “title_ro” 
(1.8MB), etc

Any idea Marius?

3 - IRCBot Analysis
===============

* We use 3 IRCBot threads. They take 55MB each!
* The 55MB is taken by the ExecutionContext
* More precisely, the 55MB is held in 77371 
org.apache.velocity.runtime.parser.node.Node[] objects

I need to understand why it’s so large, since it doesn’t look normal.

I also wonder if it keeps increasing or not.

5 - SOLR LRUCache Analysis
=======================

* It’s a map of 512 entries (Lucene Document objects); 512 is the cache size.
* Entries are instances of DocSlice

Looks ok and normal.

6 - EM DefaultCoreExtensionRepository Analysis
======================================

* 38MB in "Map<String, DefaultCoreExtension> extensions"
* 33MB in org.codehaus.plexus.util.xml.Xpp3Dom instances (44844 instances), 
which I guess mostly corresponds to the pom.xml files of all our core 
extensions.

Looks normal even though 33MB is quite a lot.

7 - NamespaceURLClassLoader Analysis
================================

* 23MB in org.eclipse.jgit.storage.file.WindowCache
* So this seems related to the XWiki Git Module used by the GitHubStats 
application installed on dev.xwiki.org

This looks ok and normal according to 
http://download.eclipse.org/jgit/docs/jgit-2.0.0.201206130900-r/apidocs/org/eclipse/jgit/storage/file/WindowCache.html

Thanks
-Vincent




_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs