Hey Phil

Thanks for responding.

So, in your opinion, a maxWaitMillis of 5 seconds is a better default than
the DBCP2 default of waiting indefinitely?

Thanks for your input again and pointing to those resources.

I think clearly I need to spend more time researching this.




On Fri, Jan 1, 2021 at 11:22 PM Phil Steitz <phil.ste...@gmail.com> wrote:

>
>
> On 12/20/20 10:26 PM, Hrafn Malmquist wrote:
> > Hi Gary
> >
> > Thanks for taking the time to respond.
> >
> > I hope you can bear with me as I am still learning about database
> > connection pooling.
> >
> > Perhaps I did not ask the question correctly. I am not asking about a
> > site specific setup but rather what defaults should be shipped with the
> > software. I am part of the minor version release team.
> >
> > Currently, the default setup is a DBCP2 v. 2.1.1 connection pool with
> > only maxWaitMillis, maxIdle and maxTotal configurable in the DSpace
> > configuration settings and the default values for these settings set to
> > 5000, 10 and 30 respectively. It's unclear why these defaults were chosen
> > to begin with, git blame shows they were chosen back in 2015. I don't
> > think a lot of thought went into choosing 1) which parameters should be
> > configurable nor 2) what their defaults should be (or why they should
> > differ from DBCP2 defaults).
> >
> > DSpace repositories are run by higher education institutions and all
> > sorts of institutions and organisations involved in research, for
> > instance the Smithsonian (https://repository.si.edu/). Therefore,
> > although the vast majority of instances are run by small institutions
> > that get little traffic, others are likely to receive relatively heavy
> > traffic, from users and crawlers.
> >
> > So the idea is to ask the experts what parameters should be configurable
> > for the average repository admin, keeping in mind that the aim is for
> > installation and setup to be simple (in effect, what are the "main"
> > parameters likely to need tweaking) and what should the out-of-the-box
> > defaults be (if at all different from the DBCP2 defaults).
> >
> > I am particularly surprised at the low maxWaitMillis chosen. Is that not
> > likely to cause problems for high traffic sites?
>
> I would say no.  Having threads blocked waiting for connections for
> longer than 5 seconds will likely cause problems in heavily loaded
> applications.  You will end up running out of app server processing
> threads if they are hanging for that long.   If getConnection is taking
> that long, there is likely a problem somewhere in the overall system -
> processing threads holding connections too long, not enough connections,
> database latency, etc.  It all comes down to queuing theory.  If your
> app does not hold connections long and queries are optimized, even a
> relatively small pool can handle decent load.  The key is not to
> leave connections open or hold on to them too long.
>
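To make the point about bounded waits concrete for myself, here is a minimal sketch in Python rather than Java. It is not DBCP code: TinyPool, borrow, give_back and borrow_outcome are hypothetical names, and a semaphore stands in for the pool. It only models the difference between waiting indefinitely and giving up after a maximum wait.

```python
import threading


class TinyPool:
    """Toy stand-in for a connection pool (hypothetical, not the DBCP API).

    It models only two things: a cap on borrowed connections and an
    optional maximum wait when the cap has been reached.
    """

    def __init__(self, max_total, max_wait_seconds=None):
        self._slots = threading.Semaphore(max_total)
        # None means "wait indefinitely", like an unbounded max wait.
        self._max_wait = max_wait_seconds

    def borrow(self):
        # acquire() returns False if the timeout expires before a slot frees up.
        if not self._slots.acquire(timeout=self._max_wait):
            raise TimeoutError("no connection available within the max wait")

    def give_back(self):
        self._slots.release()


def borrow_outcome(pool):
    """Try to borrow once and report what happened."""
    try:
        pool.borrow()
        return "borrowed"
    except TimeoutError:
        return "timed out"


# One connection, 50 ms max wait (a scaled-down stand-in for 5000 ms).
pool = TinyPool(max_total=1, max_wait_seconds=0.05)
pool.borrow()                      # the only connection is now in use
first = borrow_outcome(pool)       # fails fast instead of hanging
pool.give_back()
second = borrow_outcome(pool)      # succeeds once a connection is returned
```

With max_wait_seconds=None the second borrower would block until someone called give_back, which is the "thread hangs and the app server runs out of threads" case described above.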
> The defaults above look OK to me, though if database connections are not
> in short supply, I would bump maxIdle to 20.  The reason for this is
> that setting it at 10 means that if the number used regularly goes up to
> 20+, you will end up with a lot of connection churn.  On the other hand,
> if the usage pattern is spikes now and then followed by long periods of
> lighter load, setting it at 20 will "waste" some connections.  How
> important that "waste" is depends on what else is going on in the DB,
> how many pools are sharing it, etc.
>
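A rough way to see the churn effect is to count how many physical connections a simplified pool has to open while load swings between two levels. The sketch below is an assumption-laden model, not DBCP itself: count_created is a hypothetical helper, the trace is made up, and the model ignores the idle evictor, minIdle and validation. The only rule it keeps is that a returned connection is closed when maxIdle connections are already idle.

```python
def count_created(active_trace, max_idle):
    """Count connections opened while following a trace of in-use counts.

    Simplified model: a returned connection is kept idle only while fewer
    than max_idle connections are idle; otherwise it is closed.
    """
    active = 0
    idle = 0
    created = 0
    for target in active_trace:
        while active < target:      # a thread borrows a connection
            if idle > 0:
                idle -= 1           # reuse an idle connection
            else:
                created += 1        # nothing idle, open a new one
            active += 1
        while active > target:      # a thread returns a connection
            active -= 1
            if idle < max_idle:
                idle += 1           # keep it for later reuse
            # else: the connection is closed and must be reopened later
    return created


# Made-up load pattern: in-use connections swing between 5 and 25, ten times.
trace = [5, 25] * 10

churn_with_10 = count_created(trace, max_idle=10)
churn_with_20 = count_created(trace, max_idle=20)
```

In this toy trace, maxIdle=10 forces ten connections to be reopened on every upswing, while maxIdle=20 opens the initial 25 and then reuses them. The price is the 20 idle connections held during the quiet phases, which is the "waste" mentioned above.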
> I would recommend upgrading to the latest version compatible with the
> version of Tomcat you are running, or simply using the version that ships
> with Tomcat (which is generally the latest compatible). Another reason
> to upgrade DBCP if you are using it directly is to pick up the fixes in
> the later version of Commons Pool that it brings in.
>
> For some general info on how dbcp and pool configs work, see [1]. It is
> old, but the basic concepts are still correct.  If you are familiar with
> queuing theory, you can view a pool with n connections as an M/M/n
> queue.  What drives everything is request arrival rate and service time,
> which in the case of dbcp is how long an application thread holds a
> connection.   You can observe actual utilization using the JMX interfaces.
>
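For my own notes, the M/M/n view can be turned into numbers with the standard Erlang C formula. The sketch below assumes Poisson arrivals and exponentially distributed hold times, which real traffic only approximates. The function names and the example figures are mine, not measurements from any DSpace instance.

```python
from math import factorial


def erlang_c(n, arrival_rate, mean_hold_time):
    """Probability that a request must wait for a connection (M/M/n queue).

    n              -- number of connections in the pool
    arrival_rate   -- connection requests per second
    mean_hold_time -- average seconds a thread holds a connection
    """
    offered = arrival_rate * mean_hold_time   # offered load in erlangs
    if offered >= n:
        return 1.0                            # overloaded: every request waits
    rho = offered / n                         # utilization per connection
    top = offered ** n / factorial(n) / (1 - rho)
    bottom = sum(offered ** k / factorial(k) for k in range(n)) + top
    return top / bottom


def mean_wait(n, arrival_rate, mean_hold_time):
    """Average time in seconds that a request spends waiting for a connection."""
    offered = arrival_rate * mean_hold_time
    if offered >= n:
        return float("inf")                   # the queue grows without bound
    return erlang_c(n, arrival_rate, mean_hold_time) * mean_hold_time / (n - offered)


# Made-up example: 30 connections, 100 requests/s, 50 ms held per request.
p_wait = erlang_c(30, 100.0, 0.05)
avg_wait = mean_wait(30, 100.0, 0.05)
```

The offered load here is 100 * 0.05 = 5 connections busy on average, far below 30, so waiting is rare. Holding connections ten times longer pushes the offered load to 50, above the pool size, and the wait becomes unbounded. That matches the point that hold time matters more than raw pool size.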
> Phil
>
> [1] https://www.slideshare.net/psteitz/apachecon2014-pooldbcp
> >
> > Best regards, Hrafn
> >
> >
> > [1] : https://github.com/DSpace/DSpace/blob/250c87dc1604c34e2a963b6804163c73278e9ff7/dspace/config/spring/api/core-hibernate.xml#L41-L48
> >
> > [2] : https://github.com/DSpace/DSpace/blob/250c87dc1604c34e2a963b6804163c73278e9ff7/dspace/config/dspace.cfg#L77-L86
> >
> > On Sun, Dec 20, 2020 at 6:40 PM Gary Gregory <garydgreg...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Each new DBCP release brings fixes, additions, and other updates, as
> >> you can read in the release notes.
> >>
> >> How to best configure DBCP for any given combination of JDBC driver, its
> >> database, and application will be quite variable, which is somewhat out
> >> of scope here IMO.
> >>
> >> Gary
> >>
> >> On Fri, Dec 18, 2020, 11:15 Hrafn Malmquist <hrafn.malmqu...@gmail.com> wrote:
> >>
> >>> Good day
> >>>
> >>> I'm wondering what are optimal defaults for DSpace, open source digital
> >>> repository software aimed especially at  academic, non-profit, and
> >>> commercial organizations (see https://duraspace.org/dspace/).
> >>>
> >>> DSpace supports both Postgres and Oracle and recommends Tomcat, Jetty
> >>> or Caucho Resin. I suspect 9/10 installations use Tomcat.
> >>>
> >>> DSpace comes packaged with Apache Commons DBCP 2.1.1. DSpace sets only
> >>> three DBCP2 parameters to non-default values (see [1] and [2]).
> >>>
> >>> These are
> >>> maxTotal = 30
> >>> maxIdle = 10
> >>> maxWaitMillis = 5000
> >>>
> >>> I am not sure what reasoning is behind the choice of these
> >>> configuration settings. DSpace is used by all sorts of institutions,
> >>> some receiving very high traffic. My guess is that using the DBCP2
> >>> defaults is recommended. My question is, is this a good default
> >>> configuration? Should more parameters be configurable by DSpace users
> >>> in the DSpace config? There have been reports of the database not
> >>> being reachable because of too many idle connections. According to one
> >>> doc [3], maxWaitMillis should be at a minimum of 10000 ms, if I
> >>> understand correctly.
> >>>
> >>> Also, I assume there are benefits to upgrading the DBCP2 dependency to
> >>> the most recent version, 2.8.0. I'm not sure what the major benefits
> >>> are though. I can see v. 2.5.0 only runs on Java 8.
> >>>
> >>> [1] - https://github.com/DSpace/DSpace/blob/755f0732aeea7dd1449830593caa54d77890e5bd/dspace/config/local.cfg.EXAMPLE#L88-L99
> >>> [2] - https://github.com/DSpace/DSpace/blob/755f0732aeea7dd1449830593caa54d77890e5bd/dspace/config/spring/api/core-hibernate.xml#L46-L48
> >>> [3] - https://tomcat.apache.org/tomcat-8.0-doc/jndi-datasource-examples-howto.html#Intermittent_Database_Connection_Failures
>
>
