Hi,

We're using DSpace 5 with a custom-themed XMLUI. We're still using the old 
Lucene search (because of some legacy customizations) and an old version of 
Postgres (8.4, since campus IT, who runs our production environment, isn't 
supporting anything more recent). We're working to upgrade to Discovery and 
Postgres 9.5, which may help alleviate some of these issues. 
We also have a fairly large archive (~90k items). 
Lately, we've run into multiple situations where non-search-engine 
crawlers or other automated users hit us with a large number of 
requests (most recently with spikes around 100 requests/second). This 
consistently brings down the server, and I end up having to block the 
offending IP addresses by hand. It happens every few weeks. 
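Since the blocking is currently manual, I've been considering per-IP 
throttling at the firewall as something more sustainable. A rough sketch 
with the netfilter "recent" module (the port and thresholds here are 
assumptions, not tuned values):

```shell
# Sketch: drop new connections from any single IP that opens more than
# 20 connections per second to Tomcat (port 8080 is an assumption).
# Requires root; thresholds are illustrative only.
iptables -A INPUT -p tcp --dport 8080 -m state --state NEW \
  -m recent --name dspace --set
iptables -A INPUT -p tcp --dport 8080 -m state --state NEW \
  -m recent --name dspace --update --seconds 1 --hitcount 20 -j DROP
```

I'd be curious whether others handle abusive crawlers this way or at the 
Apache/Tomcat layer instead.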
Usually I find errors like java.lang.OutOfMemoryError: unable to create new 
native thread or org.postgresql.util.PSQLException: Connection rejected: 
could not fork new process for connection: Resource temporarily unavailable. 
So it seems we're running up against OS limits rather than 
exhausting the JVM heap. 
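When it happens, I've been comparing the live counts against the limits 
that "unable to create new native thread" usually points at, roughly like 
this (a Linux-only sketch):

```shell
# Rough sketch: compare live thread/file counts against the OS limits
# implicated by "unable to create new native thread" and fork failures.
echo "max user processes (ulimit -u): $(ulimit -u)"
echo "threads in use (all users):     $(ps -eLf --no-headers | wc -l)"
echo "system-wide open files:         $(awk '{print $1}' /proc/sys/fs/file-nr)"
echo "system-wide file limit:         $(cat /proc/sys/fs/file-max)"
```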

I've been logging the number of postgres processes, open files, httpd 
processes, etc., and these all climb very high whenever DSpace crashes 
with the errors above. 
I still haven't gotten to the bottom of the problem, but it has raised 
some questions about how others run DSpace in production:
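The logging is essentially a timestamped snapshot along these lines, run 
from cron (the log path and the process names "postgres"/"httpd" are 
site-specific):

```shell
# Append a timestamped snapshot of process and open-file counts to a log.
# Log path and process names ("postgres", "httpd") are site-specific.
LOG=/tmp/dspace-load.log
printf '%s pg=%s httpd=%s open_files=%s\n' \
  "$(date '+%Y-%m-%dT%H:%M:%S')" \
  "$(ps -C postgres --no-headers | wc -l)" \
  "$(ps -C httpd --no-headers | wc -l)" \
  "$(awk '{print $1}' /proc/sys/fs/file-nr)" >> "$LOG"
```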

First off, should I expect to handle a load of 100 requests per second? We 
usually see between 2 and 10, with occasional spikes up to around 50. 
What load should the system be prepared to handle?

As I said, campus IT runs our production VM and I'll need to coordinate with 
them to change system parameters like ulimits, but it seems the 
problem could be related to these being too low. 
Have others running DSpace on Linux had to raise system ulimits? 
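In case it helps anyone answering: the kind of change I expect to request 
from campus IT is along these lines (the "tomcat" user name and the values 
are illustrative, not recommendations):

```
# /etc/security/limits.conf -- illustrative entries for the user Tomcat runs as
tomcat  soft  nproc   4096
tomcat  hard  nproc   8192
tomcat  soft  nofile  16384
tomcat  hard  nofile  16384
```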

I've noticed that the problem seems to occur when we get a lot of requests 
for pages rather than bitstreams. Is it possible that the XSLT processing 
is causing a backlog? Has anyone experimented with XSLT processors 
other than Xalan?  
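I haven't tried swapping processors myself. At the JVM level the standard 
mechanism is the JAXP TransformerFactory system property, though I don't 
know whether Cocoon (which XMLUI sits on) honors it or selects its XSLT 
engine through its own configuration. A sketch, assuming the Saxon-HE jar 
is on Tomcat's classpath:

```shell
# Generic JVM-level switch to Saxon for JAXP-based XSLT transforms.
# Assumes the Saxon jar is on the classpath; Cocoon may choose its
# XSLT engine through its own configuration instead, so this may not
# affect XMLUI theming at all.
export JAVA_OPTS="$JAVA_OPTS -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl"
echo "$JAVA_OPTS"
```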

I've also never seen the number of postgres processes go down, so if I 
leave the system running for a week or so without restarting, the number of 
open files and postgres connections gets very large. I know there have been 
perennial issues with the connection pool, but I'm wondering whether those 
have been resolved with the move to Hibernate, or whether this will improve 
when we upgrade Postgres.
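In the meantime I've been watching the connections with queries like these 
(the user and database names are ours; note that on 8.4 idle backends show 
current_query = '&lt;IDLE&gt;', whereas on 9.2+ you would filter on 
state = 'idle' instead):

```shell
# Count total and idle backends (PostgreSQL 8.4 syntax; user/db names
# are site-specific). On 9.2+ use: WHERE state = 'idle'.
psql -U dspace -d dspace -c "SELECT count(*) FROM pg_stat_activity;"
psql -U dspace -d dspace -c \
  "SELECT count(*) FROM pg_stat_activity WHERE current_query = '<IDLE>';"
```

If the idle count tracks the total, that would suggest pooled connections 
are being held rather than leaked.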

Thanks,
Seth Robbins

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.