Sounds reasonable on the maxWait increase. How about 10 sec? Written up as https://issues.jasig.org/browse/UP-4325. I'll try to get to it later this month to give others a chance to comment on it.
James Wennmacher - Unicon
480.558.2420

On 12/02/2014 12:38 PM, Tim Levett wrote:
>
> Random collection of thoughts on this topic:
>
>   * Raw events would need more than one connection. There are 3 major
>     interactions (might be more) with UP_RAW_EVENTS: insert for new
>     events, read/update for aggregation, and purging.
>   * This pool is per node, so if you are clustering (like our 4 nodes)
>     this can get you into trouble if you have limits on your
>     concurrent database connections.
>   * It looks like we use the default pool size here on MyUW. I don't
>     have access to the admin interface in production.
>   * The only time we had database connection issues is when we had a
>     database hiccup (single point of failure) and a bunch of requests
>     queued up.
>   * If we do modifications, I would also suggest upping the wait for
>     connection timeout.
>
> Thanks,
>
> Tim Levett
> tim.levettATwisc.edu
> MyUW-Infrastructure
>
> ------------------------------------------------------------------------
> *From:* [email protected]
> <[email protected]> on behalf of James
> Wennmacher <[email protected]>
> *Sent:* Tuesday, December 2, 2014 1:02 PM
> *To:* [email protected]
> *Subject:* [uportal-dev] uPortal database connection pool size
>
> Answering the question below on the uportal-user list got me thinking
> (a dangerous thing indeed ... :-) ).
>
> Currently all 3 of the uPortal DB connection pool sizes defined in
> datasourceContext.xml are set to the same max value (75 by
> default). In glancing at the code I am thinking that the raw events
> DB pool and the aggregation events DB pool are running on timed
> threads and only use 1 DB connection each (see
> https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/java/org/jasig/portal/events/handlers/QueueingEventHandler.java#L41
> and
> https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/resources/properties/contexts/schedulerContext.xml#L73).
>
> They can both have a smaller maxActive value to limit the exposure of
> an error somehow consuming large numbers of DB connections and
> impacting uPortal and portlets (via consuming too many DB connections
> on the DB server). I was thinking of setting their maxActive value to
> 5 to allow simultaneous threads for saving, purging, and querying.
>
> Does anyone see a problem with this strategy? UWMadison or someone
> with an active system, can you glance at the DB MBeans AggrEventsDB and
> RawEventsDB in uPortal/Datasource and see if the NumActive + NumIdle
> are even close to 5 on your system? (Unfortunately, without monitoring
> tools I don't see how you'd find out what the max # of connections
> ever made was.)
>
> Thanks in advance for your insights and thoughts.
>
> James Wennmacher - Unicon
> 480.558.2420
>
>
> -------- Forwarded Message --------
> Subject: Re: [uportal-user] Increase database connection pool size
> Date: Tue, 02 Dec 2014 11:02:02 -0700
> From: James Wennmacher <[email protected]>
> To: [email protected]
>
>
>
> Database connection counts are defined in
> uportal-war/src/main/resources/properties/contexts/datasourceContext.xml.
>
> uPortal uses the DB connections for a fairly brief period of time.
> The message 'none available[size:75; busy:0; idle:0; lastwait:5000]'
> plus your comment about leaving it overnight makes me wonder if
> somehow the connections are being lost and not reclaimed. I suggest:
>
> 1. Ensure that the load test is not hitting servers too heavily; e.g.
> load is distributed evenly. I could see running out of DB connections
> happening if a server gets hammered (though the connections should be
> freed up at some point later). Does it happen primarily to one or two
> servers and not all of them?
>
> 2.
> Try adding the following properties to the basePooledDataSource
> bean in datasourceContext.xml:
>
> <property name="logAbandoned" value="true" />
> <property name="numTestsPerEvictionRun" value="5" />
>
> This may not resolve the issue, but perhaps the logging will provide a
> clue to what's going on. However it is likely the additional logging
> will not trigger. The property minEvictableIdleTimeMillis is supposed
> to release a connection after it has been idle for the specified
> number of milliseconds, and the properties abandonWhenPercentageFull,
> removeAbandoned, and removeAbandonedTimeout which are specified are
> supposed to clean up abandoned connections (allocated but not used in
> removeAbandonedTimeout seconds when a new connection is requested but
> none are available). However in a load test scenario, especially one
> where a server is taxed very heavily, the removeAbandonedTimeout value
> may be too high (value is 300 sec) if connections are heavily used, so
> no connections may be considered abandoned and harvested during the
> test. However I wouldn't change removeAbandonedTimeout just yet.
> After the test completes however there may be some useful log messages
> if some connections were consumed and not released. If nothing else, 5
> minutes after the test completes you should be able to log onto the
> server even if all connections are consumed, since it should consider
> at least some of the connections as abandoned and eligible for
> harvesting. However your comment about leaving the system overnight
> and the issue still existing makes me think the abandoned connections
> will not be harvested. Still worth trying logAbandoned to see if it
> provides more info.
>
> 3. Are there other DB connection error messages?
> The ones you mentioned are for event aggregation (runs periodically to
> aggregate and purge raw portal activity event data) and for jgroups
> (used for distributed cache management to allow uPortal nodes to notify
> other uPortal nodes about cache replication or invalidation). Were
> there any for uPortal activity not having a database connection?
>
> 4. It would be great to get more information so we can try to fix the
> issue of the connections not being released. Even when fully
> consumed, the connections should release after a period of time (after
> being idle for minEvictableIdleTimeMillis milliseconds, or 5 minutes
> at the latest per removeAbandonedTimeout property when attempting to
> get a new connection and none are available). When this situation
> occurs are you able to look at the DB server and see if the DB sees
> the 75 connections from the failed uPortal server, if the DB thinks
> the connections are idle, and what the last SQL command was on each of
> the DB connections? It is also possible that network issues between
> the uPortal server and DB server are causing network socket
> connections to hang. Finding out if the DB server is aware of the
> connections, their state, and what the last activity was should help
> determine if that is the case and hopefully point us to where the
> issue is.
>
> 5. Barring additional investigation above (which I'd really like to
> have investigated and addressed), if you decide to try and increase
> the DB connections you'll want to discuss with your DBA. Each uPortal
> server will make up to 3 times the specified number of connections
> (75 for uPortal app use, 75 for raw event storage, 75 for event
> aggregation), plus some portlets (newsreader, announcements, simple
> content portlet, calendar, bookmarks) have separate db connection
> pools or make DB connections on each request that will go to the
> same DB.
> If you have 10 servers, assuming they each make 3 * 75 plus another 30
> to 50 connections for the portlets (making a guess at max portlet
> connections), the max calculation would be that your DB server would
> need to have resources to handle 10 * (3 * 75 + 30) DB connections. As
> a note I don't think that the raw events or the aggregation events DB
> pools are likely to use 75 DB connections each as I think they are
> threads that run periodically on a timer and would use only 1 or 2 DB
> connections each (barring some software fault) even though their pool
> sizes are a max of 75. If I'm right the real calculations would be
> more like 10 * (75 + 2 + 30), though the portlets would be less likely
> to max out their connections unless they are all on the main landing
> page or otherwise close together in the page flow.
>
> In light of above it is possible that part of what is going on is that
> uPortal is attempting to request a DB connection, but the DB server is
> maxed out and it rejects the open. I'm not sure if that's what is
> going on but it is worth investigating.
>
> I hope this helps, and please let us know what you find out.
>
> Thanks,
> James Wennmacher - Unicon
> 480.558.2420
> On 12/02/2014 08:46 AM, Ryan Melissari wrote:
>> I am in the process of load testing uPortal and am running out of
>> database connections. I have looked and don't see where I would
>> increase this. From the log file it is set to 75...does anyone know
>> what a good number to increase this to would be? Also, it seems that
>> once it uses all the connections, it never releases them. I have left
>> it overnight and it never gives them back, forcing me to restart
>> tomcat. Is there a way to set a max wait as well?
>> Here is the error I am getting in the portal.log:
>>
>> INFO [uP-TaskExec-7-aggregateRawEvents]
>> o.h.e.i.DefaultLoadEventListener 2014-12-02 09:27:54,147 - HHH000327:
>> Error performing load command :
>> org.hibernate.exception.GenericJDBCException: Could not open connection
>> ERROR [Timer-5,uPortal.cacheManager,htst2web1-56682]
>> o.j.p.jgroups.protocols.DAO_PING 2014-12-02 09:27:56,126 - failed
>> sending discovery request
>> org.springframework.jdbc.CannotGetJdbcConnectionException: Could not
>> get JDBC Connection; nested exception is
>> org.apache.tomcat.jdbc.pool.PoolExhaustedException:
>> [Timer-5,uPortal.cacheManager,htst2web1-56682] Timeout: Pool empty.
>> Unable to fetch a connection in 5 seconds, none available[size:75;
>> busy:0; idle:0; lastwait:5000].
>>
>>
>> --
>>
>> You are currently subscribed to [email protected]
>> as: [email protected]
>> To unsubscribe, change settings or access archives,
>> see http://www.ja-sig.org/wiki/display/JSG/uportal-user
>
> --
>
> You are currently subscribed to [email protected] as:
> [email protected]
> To unsubscribe, change settings or access archives, see
> http://www.ja-sig.org/wiki/display/JSG/uportal-dev
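
The connection estimates in item 5 of the forwarded reply are easy to check with a few lines of arithmetic. Below is a minimal sketch in Python, assuming the figures quoted in the thread (10 servers, pools of 75, a guessed 30 portlet connections per server, and the proposed maxActive of 5 for the two event pools). The function and variable names are hypothetical and are not part of uPortal.

```python
def max_db_connections(servers, pool_sizes, portlet_connections):
    """Upper bound on connections the DB server must accept from a cluster.

    servers: number of uPortal nodes
    pool_sizes: per-node maximum for each uPortal connection pool
    portlet_connections: guessed per-node maximum for portlet pools
    """
    per_server = sum(pool_sizes) + portlet_connections
    return servers * per_server


# Worst case from the thread: all three pools can reach 75.
worst_case = max_db_connections(10, [75, 75, 75], 30)

# Likelier case from the thread: 75 for the app pool plus about 2 in total
# for the two timer-driven event pools (the thread's "75 + 2 + 30").
likely_case = max_db_connections(10, [75, 2], 30)

# Ceiling if both event pools were capped at maxActive=5 as proposed.
proposed_cap = max_db_connections(10, [75, 5, 5], 30)

print(worst_case)    # 2550
print(likely_case)   # 1070
print(proposed_cap)  # 1150
```

With these assumed numbers, capping the two event pools lowers the guaranteed ceiling per cluster from 2550 to 1150 connections, which is the kind of figure to bring to a DBA when sizing the database's connection limit.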
