Hey Phil, thanks for responding.
So, in your opinion, maxWaitMillis is optimally configured at 5 seconds as
opposed to the default of waiting indefinitely?

Thanks again for your input and for pointing to those resources. Clearly I
need to spend more time researching this.

On Fri, Jan 1, 2021 at 11:22 PM Phil Steitz <phil.ste...@gmail.com> wrote:

> On 12/20/20 10:26 PM, Hrafn Malmquist wrote:
> > Hi Gary
> >
> > Thanks for taking the time to respond.
> >
> > I hope you can bear with me as I am still learning about database
> > connection pooling.
> >
> > Perhaps I did not ask the question correctly. I am not asking about a
> > site-specific setup but rather what defaults should be shipped with the
> > software. I am part of the minor version release team.
> >
> > Currently, the default setup is a DBCP2 v. 2.1.1 connection pool with
> > only maxWaitMillis, maxIdle and maxTotal configurable in the DSpace
> > configuration settings, and the default values for these settings set
> > to 5000, 10 and 30 respectively. It's unclear why these defaults were
> > chosen to begin with; git blame shows they were chosen back in 2015. I
> > don't think a lot of thought went into choosing 1) which parameters
> > should be configurable or 2) what their defaults should be (or why they
> > should differ from DBCP2 defaults).
> >
> > DSpace repositories are run by higher education institutions and all
> > sorts of institutions and organisations involved in research, for
> > instance the Smithsonian (https://repository.si.edu/). Therefore,
> > although the vast majority of instances are run by small institutions
> > that get little traffic, others are likely to receive relatively heavy
> > traffic, from users and crawlers.
> >
> > So the idea is to ask the experts what parameters should be configurable
> > for the average repository admin, keeping in mind that the aim is for
> > installation and setup to be simple (in effect, what are the "main"
> > parameters likely to need tweaking) and what should the out-of-the-box
> > defaults be (if at all different from the DBCP2 defaults).
> >
> > I am particularly surprised at the low maxWaitMillis chosen. Is that not
> > likely to cause problems for high traffic sites?
>
> I would say no. Having threads blocked waiting for connections for
> longer than 5 seconds will likely cause problems in heavily loaded
> applications. You will end up running out of app server processing
> threads if they are hanging for that long. If getConnection is taking
> that long, there is likely a problem somewhere in the overall system:
> processing threads holding connections too long, not enough connections,
> database latency, etc. It all comes down to queuing theory. If your
> app does not hold connections long and queries are optimized, even a
> relatively small pool can handle decent load. The key is not to
> leave connections open or hold on to them too long.
>
> The defaults above look OK to me, though if database connections are not
> in short supply, I would bump maxIdle to 20. The reason for this is
> that setting it at 10 means that if the number used regularly goes up to
> 20+, you will end up with a lot of connection churn. On the other hand,
> if the usage pattern is spikes now and then followed by long periods of
> lighter load, setting it at 20 will "waste" some connections. How
> important that "waste" is depends on what else is going on in the DB,
> how many pools are sharing it, etc.
>
> I would recommend upgrading to the latest version compatible with the
> version of Tomcat you are running, or simply using the version that ships
> with Tomcat (which is generally the latest compatible).
> Another reason to upgrade dbcp if you are using it directly is to pick
> up the fixes in the later version of commons pool that it brings in.
>
> For some general info on how dbcp and pool configs work, see [1]. It is
> old, but the basic concepts are still correct. If you are familiar with
> queuing theory, you can view a pool with n connections as an M/M/n
> queue. What drives everything is request arrival rate and service time,
> which in the case of dbcp is how long an application thread holds a
> connection. You can observe actual utilization using the JMX interfaces.
>
> Phil
>
> [1] https://www.slideshare.net/psteitz/apachecon2014-pooldbcp
>
> > Best regards, Hrafn
> >
> > [1] :
> > https://github.com/DSpace/DSpace/blob/250c87dc1604c34e2a963b6804163c73278e9ff7/dspace/config/spring/api/core-hibernate.xml#L41-L48
> >
> > [2] :
> > https://github.com/DSpace/DSpace/blob/250c87dc1604c34e2a963b6804163c73278e9ff7/dspace/config/dspace.cfg#L77-L86
> >
> > On Sun, Dec 20, 2020 at 6:40 PM Gary Gregory <garydgreg...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Each new DBCP release brings fixes, additions, and other updates, as
> >> you can read in the release notes.
> >>
> >> How to best configure DBCP for any given combination of JDBC driver, its
> >> database, and application will be quite variable, which is somewhat out
> >> of scope here IMO.
> >>
> >> Gary
> >>
> >> On Fri, Dec 18, 2020, 11:15 Hrafn Malmquist <hrafn.malmqu...@gmail.com>
> >> wrote:
> >>
> >>> Good day
> >>>
> >>> I'm wondering what are optimal defaults for DSpace, open source digital
> >>> repository software aimed especially at academic, non-profit, and
> >>> commercial organizations (see https://duraspace.org/dspace/).
> >>>
> >>> DSpace supports both Postgres and Oracle and recommends Tomcat, Jetty
> >>> or Caucho Resin. I suspect 9/10 installations use Tomcat.
> >>>
> >>> DSpace comes packaged with Apache Commons DBCP 2.1.1.
> >>> DSpace only configures three DBCP2 settings with non-default values
> >>> (see [1] and [2]).
> >>>
> >>> These are
> >>> maxTotal = 30
> >>> maxIdle = 10
> >>> maxWaitMillis = 5000
> >>>
> >>> I am not sure what reasoning is behind the choice of these
> >>> configuration settings. DSpace is used by all sorts of institutions,
> >>> some receiving very high traffic. My guess is that using the DBCP2
> >>> defaults is recommended. My question is, is this a good default
> >>> configuration? Should more settings be configurable by DSpace users
> >>> in the DSpace config? There have been reports of the database not
> >>> being reachable because of too many idle connections. According to
> >>> one doc [3], maxWaitMillis should be at a minimum of 10000 ms, if I
> >>> understand correctly.
> >>>
> >>> Also, I assume there are benefits to upgrading the DBCP2 dependency
> >>> to the most recent version, 2.8.0. I'm not sure what the major
> >>> benefits are though. I can see v. 2.5.0 only runs on Java 8.
> >>>
> >>> [1] -
> >>> https://github.com/DSpace/DSpace/blob/755f0732aeea7dd1449830593caa54d77890e5bd/dspace/config/local.cfg.EXAMPLE#L88-L99
> >>> [2] -
> >>> https://github.com/DSpace/DSpace/blob/755f0732aeea7dd1449830593caa54d77890e5bd/dspace/config/spring/api/core-hibernate.xml#L46-L48
> >>> [3] -
> >>> https://tomcat.apache.org/tomcat-8.0-doc/jndi-datasource-examples-howto.html#Intermittent_Database_Connection_Failures
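
As a rough illustration of the M/M/n view of a pool mentioned in the thread,
the sketch below estimates how often a borrower would wait, and how often it
would wait longer than maxWaitMillis, using the standard Erlang C formula. It
assumes Poisson arrivals and exponentially distributed connection hold times,
which real traffic only approximates. The pool size (30) and wait limit (5 s)
come from the thread; the class name, the arrival rate (200 requests/s) and
the mean hold time (100 ms) are made-up values for illustration only.

```java
// Hypothetical helper, not part of DBCP or DSpace.
class PoolQueueSketch {

    /**
     * Erlang C: probability that an arriving request finds all c connections
     * busy in an M/M/c queue. lambda = arrivals per second, mu = completions
     * per second per connection (1 / mean hold time in seconds).
     */
    static double probWait(int c, double lambda, double mu) {
        double a = lambda / mu;          // offered load
        double rho = a / c;              // utilization per connection
        if (rho >= 1.0) {
            return 1.0;                  // overloaded: the queue grows without bound
        }
        double term = 1.0;               // holds a^k / k!
        double sum = 0.0;
        for (int k = 0; k < c; k++) {
            sum += term;
            term *= a / (k + 1);
        }
        double tail = term / (1.0 - rho); // term is now a^c / c!
        return tail / (sum + tail);
    }

    /** Probability that a request waits longer than t seconds for a connection. */
    static double probWaitLongerThan(int c, double lambda, double mu, double t) {
        if (lambda / mu >= c) {
            return 1.0;                  // overloaded
        }
        return probWait(c, lambda, mu) * Math.exp(-(c * mu - lambda) * t);
    }

    public static void main(String[] args) {
        int maxTotal = 30;               // from the thread
        double maxWaitSeconds = 5.0;     // maxWaitMillis = 5000, from the thread
        double lambda = 200.0;           // assumed arrival rate, requests per second
        double mu = 1.0 / 0.100;         // assumed mean hold time of 100 ms

        System.out.println("P(wait at all)  = " + probWait(maxTotal, lambda, mu));
        System.out.println("P(wait > limit) = "
                + probWaitLongerThan(maxTotal, lambda, mu, maxWaitSeconds));
    }
}
```

Under these assumptions the wait-time tail decays exponentially at rate
(c * mu - lambda), so a request that hits the 5 second limit points to
utilization near or above 1 (connections held too long, or too few of them)
rather than to a limit that is too short.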