Hello all,

As watchers of the Evergreen IRC channel know, the PINES upgrade from
Evergreen 2.7.2/PostgreSQL 9.3 to Evergreen 2.9.1/PostgreSQL 9.4 was a bit 
bumpy.  Our system slowed to a crawl on go-live day, and we were quick to 
determine that the culprit was the database.  After consulting with Mike 
Rylander of Equinox Software, we implemented several tweaks to our system that 
I'll share here in hopes that they may help others in the future:

The slowness appears to have been due to two problem areas:

Bad Query Plans:

- we raised join_collapse_limit from its default value of 8 to 12 (see the
sketch following this list). Dan Wells discusses this possibility in his bug
report about slow copy loading in the OPAC
(https://bugs.launchpad.net/evergreen/+bug/1527731).
- we altered the actor.org_unit_descendants functions, adding a realistic
estimate of the number of rows they are expected to return:
     ALTER FUNCTION actor.org_unit_descendants(int) ROWS 300;
     ALTER FUNCTION actor.org_unit_descendants(int,int) ROWS 300;
- we dropped a duplicate index (ahr_current_copy_open_idx) on the
action.hold_request table that had apparently been added to address an issue
in the past, as sketched below
- we added an index to action.circulation to address an issue with slow 
collections queries: 
     CREATE INDEX CONCURRENTLY circ_circ_lib_open_idx ON action.circulation
     (usr,circ_lib) WHERE xact_finish IS NULL;
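
For reference, the first and third changes above looked roughly like the
following. This is a sketch rather than a recipe: ALTER SYSTEM assumes
PostgreSQL 9.4 (on older versions you would edit postgresql.conf directly),
and you should verify that the index really exists and is redundant on your
own system before dropping it:

     -- raise the planner's join collapse limit cluster-wide; the same value
     -- can instead be set in postgresql.conf
     ALTER SYSTEM SET join_collapse_limit = 12;
     SELECT pg_reload_conf();

     -- drop the duplicate index on action.hold_request without blocking writers
     DROP INDEX CONCURRENTLY action.ahr_current_copy_open_idx;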

PostgreSQL Memory Usage:

- we reduced our shared_buffers setting from 128GB to 32GB (a sketch follows
this list). We had calculated the higher value from the PostgreSQL
documentation's recommendation of 1/4 of installed RAM, which is 512GB in our
case (http://www.postgresql.org/docs/9.4/static/runtime-config-resource.html#GUC-SHARED-BUFFERS).
We immediately saw a reduction in resource consumption on the server.
- on the underlying OS, we turned off transparent huge pages (THP; see
http://unix.stackexchange.com/questions/99154/disable-transparent-hugepages
for details on how one might do this). Further research revealed that
disabling THP is generally recommended for both Oracle and PostgreSQL
database servers.
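
Similarly, here is a minimal sketch of the shared_buffers change (again
assuming 9.4's ALTER SYSTEM; editing postgresql.conf directly is equivalent).
Unlike join_collapse_limit, this setting only takes effect after a full
server restart:

     -- shrink shared_buffers from 128GB to 32GB; requires a postmaster restart
     ALTER SYSTEM SET shared_buffers = '32GB';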

We also followed Mike's recommendation to run the hold targeter cron job 
nightly rather than every 15 minutes, but that probably didn't have much to do 
with the root cause (aside from adding queries to the pile-up when slowness 
struck).

Since we made the above changes, our database servers have sped up
considerably and are performing very happily. I just wanted to share our
experience with the Evergreen Community in hopes that others can benefit.

-- 
Chris Sharp
PINES System Administrator
Georgia Public Library Service
1800 Century Place, Suite 150
Atlanta, Georgia 30345
(404) 235-7147
csh...@georgialibraries.org
http://pines.georgialibraries.org/
