Hello everyone,

I am trying to use the DSpace 6.4 stats-util with the '-s' parameter to shard 
the DSpace statistics into separate cores based on the year. When I let the 
utility run for a while, I get the following error report from Nagios:

State: CRITICAL

Date/Time: Wed Jan 11 10:08:09 CET 2023 

Additional Info: CRITICAL - Socket timeout

When I try to access the user interface (XMLUI), I get the following 
error:

...
Caused by: org.hibernate.exception.GenericJDBCException: Could not open 
connection
+
Caused by: java.sql.SQLException: Cannot get a connection, pool error 
Timeout waiting for idle object
...

I think this is somehow related to running the stats-util script, but I 
don't know how to solve the issue.

What I did:
1. I run the 'stats-util -s' command from the terminal and get the initial 
message:

'Moving: 45459190 into core statistics-2021' 

This means that stats-util is trying to move 45 459 190 individual 
Solr statistics records to a new shard, statistics-2021.

2. Then I run `tail /opt/dspace/log/solr.log -f | grep 
"org\.apache\.solr\.core\.SolrCore \@ \[statistics\] webapp\=\/solr 
path\=\/select"` to monitor the progress:

I can see that stats-util selects statistics records from Solr in batches 
of 10 000 (the 'rows' parameter), and I can monitor the query time in the 
QTime attribute. An example line from solr.log is shown below:

2023-01-11 10:53:07,617 INFO  org.apache.solr.core.SolrCore @ [statistics] webapp=/solr path=/select params={csv.mv.separator=|&q=*:*&csv.escape=\&start=340000&fq=time:([2021\-01\-01T00\:00\:00Z+TO+2022\-01\-01T00\:00\:00Z]+NOT+2022\-01\-01T00\:00\:00Z)&rows=10000&wt=csv} hits=45459190 status=0 *QTime=650*

As you can see, QTime (reported in milliseconds) is initially under a 
second, but as the script runs, it gradually rises and eventually reaches 
several minutes per query.

3. When I get to approximately 8 000 000 processed records (as indicated by 
the value of the 'start' parameter in the select query), Nagios starts 
reporting the socket timeout described above, and when I access the user 
interface, I get the 'SQLException: Cannot get a connection, pool error 
Timeout waiting for idle object' error described above.

4. I have to terminate the stats-util process and restart PostgreSQL and/or 
Tomcat to resume normal operation of our DSpace installation.
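
The monitoring from step 2 could also be scripted instead of reading the grep 
output by eye. Below is a minimal sketch, assuming the solr.log line format 
shown in the example above. The names SelectStats, parse_select_line, 
progress_percent and describe are my own (hypothetical) and are not part of 
DSpace or Solr.

```python
import re
from typing import NamedTuple, Optional


class SelectStats(NamedTuple):
    start: int      # value of the 'start' parameter (offset into the result set)
    rows: int       # value of the 'rows' parameter (batch size)
    hits: int       # total number of matching records reported by Solr
    qtime_ms: int   # QTime, in milliseconds


# Patterns are based only on the example log line; other log layouts
# may need adjusting (assumption).
_START = re.compile(r"[{&]start=(\d+)")
_ROWS = re.compile(r"[{&]rows=(\d+)")
_HITS = re.compile(r"\bhits=(\d+)")
_QTIME = re.compile(r"\bQTime=(\d+)")


def parse_select_line(line: str) -> Optional[SelectStats]:
    """Extract start, rows, hits and QTime from one /select log line.

    Returns None for lines that are not /select requests or that lack
    any of the four fields.
    """
    if "path=/select" not in line:
        return None
    matches = [p.search(line) for p in (_START, _ROWS, _HITS, _QTIME)]
    if any(m is None for m in matches):
        return None
    start, rows, hits, qtime = (int(m.group(1)) for m in matches)
    return SelectStats(start, rows, hits, qtime)


def progress_percent(start: int, hits: int) -> float:
    """Share of records already paged through, as a percentage."""
    return 100.0 * start / hits if hits else 0.0


def describe(stats: SelectStats) -> str:
    """One-line summary suitable for printing while the job runs."""
    pct = progress_percent(stats.start, stats.hits)
    return (f"start={stats.start} ({pct:.1f}% of {stats.hits}) "
            f"QTime={stats.qtime_ms} ms")
```

To use it, the lines from `tail -f /opt/dspace/log/solr.log` could be piped 
into a small loop that calls parse_select_line on each line read from 
standard input and prints describe(...) for every line that parses. This 
puts the 'start' offset and QTime side by side, so the slowdown is easier to 
follow.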

Our DSpace installation details:

DSpace version: 6.4
Java Runtime Environment Version: 1.8.0_352
Java Runtime Environment Vendor: OpenJDK 64-Bit Server VM
Operating System Version: CentOS 7 3.10.0-1160.62.1.el7.x86_64
Total memory available on server: 24 GB
Tomcat memory assigned: 3 GB
DSpace cmd tools memory assigned: 5 GB
Database: PostgreSQL 9.6
DB related configuration in local.cfg:
- db.maxconnections = 100
- db.maxwait = 5000
- db.maxidle = 30

I would appreciate any insights into this issue and any help solving this. 

Thank you,
with best regards,

Jakub Řihák
Central Library
Charles University
