To provide some additional information, we are running uP 4.0.11 and set org.jasig.portal.portlet.worker.threadPool.maxThreads=200. With about 300 concurrent users on a server, our active JVM threads hit a max of 130 today. Our portlet timeouts generally range from 10 to 15 seconds. Some external RSS feeds benefit from the longer timeouts. Also, our Student Center portlet gets a lot of data from PeopleSoft, so it needs more than 5 seconds to avoid unnecessary rendering failures.
Paul

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of James Wennmacher
Sent: Wednesday, August 28, 2013 1:55 PM
To: [email protected]
Cc: [email protected]
Subject: Re: [uportal-dev] Proposal to change portal's default render timeout from 5000ms to 20000ms

Thanks for passing on a cautionary note and providing this perspective, Anthony. This is excellent information.

Do you have any statistics on your portlets for average execution time? How many portlets do you have on your pages (in particular, pages with longer-running portlets)? What's the average execution time of the portlets on those pages? Do you have a lot of custom portlets, and is internal portlet caching not an option to avoid requests to external systems? Do you have a lot of portlet timeout indications, and on what portlets? Are your uPortal servers nearing capacity (how are they doing on JVM heap usage and garbage collection, CPU usage, etc.)?

There are 150 worker threads, which of course are shared for all page render requests, so this does provide some indication of worker thread queue sizing. (Separate thought - it would be cool if the worker thread queue defaulted to a configured size but auto-adjusted up to a separate max value and dropped back to a configured min value as needed.)

Regarding the specific situation you mentioned (and others you are aware of):
- What do you mean by brought down the servers? Were user requests for pages immediately failing or queueing up for processing (I'm not sure what the behavior is when you run out of worker threads)?
- Can you explain in more detail why adjusting one portlet's timeout from 5s to 10s brought down the servers? How many other portlets are on those specific pages, and what are their average and peak response times? Is it on the home page, guest pages or authenticated user pages? Etc.

Clearly we'll want to really understand the impacts and potential risk scenarios based on your comments.

James Wennmacher
Unicon
480.558.2420

On 08/28/2013 01:26 PM, Anthony Colebourne wrote:
> Hi James,
>
> In my experience I would recommend extreme caution when raising
> portlet timeouts.
>
> Under heavy load, it is very easy to use up all the portlet rendering
> threads.
>
> Just this morning I accidentally misconfigured a portlet whose
> timeout is 10000 and brought down our servers. We're pretty strict about
> timeout values and almost all of our portlets use the 5s default.
>
> We in many cases use ajax to load content from long-running processes.
> In some of these cases I have reluctantly raised the timeout of the
> resource requests only.
>
> I would like to see documented the relationship between rendering
> threads and timeout values, also how this relates to tomcat threads
> and apache threads where applicable. (I guess db connection pools are
> also impacted, both portal and portlet?).
>
> Some guidance on how to choose sensible thread/timeout values based on
> averages such as portlets per page / server resources would be useful.
>
> I'm happy to provide information and statistics from our production
> cluster if it helps?
>
> -- Anthony.
>
>
>
> On 28/08/13 20:08, James Wennmacher wrote:
>> I propose we change the portal's default render timeout from 5000ms to
>> 20000ms. There are portlets that tend to take a long time (such as email
>> preview) or custom portlets connecting to back end systems where 5000ms
>> is sometimes too short a time.
>>
>> I think the user experience in general would be better to have a longer
>> default so:
>> - The portal doesn't display an unpleasant message to the user for
>> longer-running scenarios
>> - The portal is more tolerant of operational issues such as uPortal or
>> dependent systems running a bit slow
>> - Longer-running processes should typically use ajax requests to obtain
>> the data for a better user experience, so the urgent user response is
>> less important in these situations since the entire UI is not impacted
>>
>> This would affect new portlets that are created via the UI (or imported
>> without a timeout value I believe) but not existing portlet instances.
>>
>> Thoughts?

--
You are currently subscribed to [email protected] as: [email protected]
To unsubscribe, change settings or access archives, see http://www.ja-sig.org/wiki/display/JSG/uportal-dev
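
The thread asks how rendering threads relate to timeout values. A rough way to reason about it is Little's law: the number of worker threads in use is approximately the rate of portlet renders multiplied by how long each render holds a thread. The sketch below is only an illustration of that arithmetic, not uPortal's actual behavior. The function names, the figure of 6 portlets per page, and the premise that a stalled portlet holds its worker thread for exactly the timeout are all assumptions; only the 150-thread pool and the 5s/20s timeouts come from the messages above.

```python
import math


def worker_threads_needed(page_requests_per_sec, portlets_per_page, hold_seconds):
    """Little's law estimate: threads busy = render arrival rate * hold time.

    Assumes every portlet on a page takes one worker thread for hold_seconds.
    """
    return math.ceil(page_requests_per_sec * portlets_per_page * hold_seconds)


def max_page_rate(pool_size, portlets_per_page, hold_seconds):
    """Page requests per second a pool can sustain before it is exhausted."""
    return pool_size / (portlets_per_page * hold_seconds)


if __name__ == "__main__":
    pool_size = 150          # worker threads mentioned in the thread
    portlets_per_page = 6    # hypothetical value, not from the thread

    # Worst case: a back end stalls and every render runs until it times out.
    for timeout_seconds in (5, 20):
        rate = max_page_rate(pool_size, portlets_per_page, timeout_seconds)
        print(f"timeout {timeout_seconds}s: pool exhausted above {rate} pages/sec")
```

Under these assumptions, raising the timeout from 5s to 20s cuts the page rate the pool can absorb during a back-end stall by a factor of four, which is consistent with the caution expressed above. Real numbers would need measured portlets per page and response times from a production cluster.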
