Recently I've been trying to reduce the memory usage of our website, and found that NHibernate was the largest source of memory usage in our system. As I dug in, I found some things that I consider to be issues, though I understand the rationale behind them. I thought we might have a discussion about it and talk about some solutions. I will say that my knowledge of NHibernate's inner workings, and of the reasons things are the way they are, is very cursory and limited to what I have discovered or guessed during this process, which is why I would like this to be a discussion.
*Background that can be skipped but explains why this is an issue for us.*

The way our system was originally built, each customer has their own website in IIS (so we are not multi-tenant), and we have a few builds that customers are on (alpha, beta, stable) that we routinely move them between. Originally everyone was in their own AppPool (process), but that caused a lot of memory issues because each site would have its own copy of the code DLLs loaded into memory. So we combined customers into groups that share an AppPool, because IIS will share identical assemblies between sites that run in separate AppDomains of one process rather than in separate processes. This works fairly well, but we are still hitting the memory limit for an AppPool about every hour, so that AppPool gets recycled and a new process is spun up. This kind of sucks because it takes about 10 seconds to initialize the application (yes, we serialize our NHibernate configuration).

*The real issue*

While poking around in WinDbg, I decided to look at the size of the SessionFactory, which for our system was ~60MB. I also noticed that there are ~30MB of strings in the process. I'm not sure how many of those belong to NHibernate, but we'll say 20MB (before the session factory is built, strings account for ~10MB).

When I look at the session factory, it looks like for every mapped type it builds the persisters / select builders / whatever, which build up SqlString instances and cache them. All of those strings are built by the SqlString class and stored in its parts. The problem is that by holding on to them, that memory can never be freed. Likely this is so that they never have to be generated again, and I do see that SqlString is immutable, so it all makes sense. The problem remains that the system can never reclaim that memory.
This sucks when most of those things never get used (in our case, we have lots of different parts of our system, but most companies only use a couple).

Now the question I have is: do all of those strings _need_ to be cached? I understand the reasons why: they never change so we should just generate them once, generating at initialization removes the need for locks, etc. At the same time I wonder, how expensive is it to generate an insert string? When it comes to select strings, how often is that string re-used? With dynamic queries (i.e. LINQ, QueryOver, or a hand-built HQL string), are they also stored someplace where they can't be garbage collected?

Another thing I noticed is that some strings could EASILY be interned so that they exist only once, where currently there are millions of duplicates. Strings like ")" and "," have lots of instances: not one shared instance, but duplicate instances spread across lots of SqlString instances. Using an interned version would help with the memory situation. On the other hand, these do not make up the majority of the memory held by strings, maybe 0.5MB at most, but there are a lot of them, so interning could help the garbage collector out.

*Ideas to help the situation*

1. The first idea, and the least invasive/problematic, would be to intern certain strings such as "(", ")", "'", "and" and "or", and use those when building SQL strings. This is only really necessary because SqlString instances are kept around, which keeps those strings alive in memory.
2. Don't cache SqlString / SqlCommand in persisters/generators/whatever; cache them in ISession and regenerate them in each session.
3. Cache them in a least-recently-used cache, a copy of which is injected into the ISession and updated when sessions are disposed / closed (this would imply that some amount of locking would need to be added, but only if sessions added new queries).

Thoughts? Suggestions?
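To make ideas 1 and 3 a bit more concrete, here is a rough sketch in Python, used only as neutral pseudocode; it is not NHibernate code. Every name in it (`intern_part`, `build_insert_sql`, `LruSqlCache`) is made up for illustration. It is also simplified: instead of a per-session copy that is merged back on close, it uses one shared cache guarded by a lock, and statements are generated lazily on first use instead of at start-up.

```python
import sys
import threading
from collections import OrderedDict


def intern_part(part):
    """Idea 1: return one canonical instance for each distinct string."""
    return sys.intern(part)


def build_insert_sql(table, columns):
    """Hypothetical generator: returns the statement as a tuple of parts,
    loosely mirroring how a SqlString keeps its pieces."""
    parts = ["INSERT INTO ", table, " ("]
    for i, column in enumerate(columns):
        if i:
            parts.append(", ")
        parts.append(column)
    parts.append(") VALUES (")
    for i, _ in enumerate(columns):
        if i:
            parts.append(", ")
        parts.append("?")
    parts.append(")")
    # Interning means equal tokens in different statements share one object.
    return tuple(intern_part(p) for p in parts)


class LruSqlCache:
    """Idea 3 (simplified): bounded cache that generates SQL on demand and
    evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity, generate):
        self._capacity = capacity
        self._generate = generate      # callable: key -> generated SQL
        self._entries = OrderedDict()  # oldest entry first
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)  # mark as most recently used
                return self._entries[key]
        # Generate outside the lock; two threads may occasionally both
        # generate the same key, which is wasteful but harmless here.
        sql = self._generate(key)
        with self._lock:
            self._entries[key] = sql
            self._entries.move_to_end(key)
            while len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # drop least recently used
        return sql
```

With something like this, a statement that is never asked for is never built, and one that stops being used eventually falls out of the cache and can be collected. The cost is a lock on every lookup plus the occasional regeneration, and that trade-off is exactly what I'm asking about.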
I understand that this would constitute large changes, and that likely nothing will come of it, but I do think these are real issues, worth thinking about and watching out for in future coding as well. As for how much work solving this would take, I have no idea because, like I said earlier, I only have a cursory understanding of NHibernate's inner workings.
