Hello everyone,
I am trying to use the DSpace 6.4 stats-util with the '-s' parameter to
shard the DSpace statistics into separate cores by year. After the utility
has been running for a while, I get the following error report from Nagios:
State: CRITICAL
Date/Time: Wed Jan 11 10:08:09 CET 2023
Additional Info: CRITICAL - Socket timeout
When I try to access the user interface (XMLUI), I get the following
error:
...
Caused by: org.hibernate.exception.GenericJDBCException: Could not open
connection
+
Caused by: java.sql.SQLException: Cannot get a connection, pool error
Timeout waiting for idle object
...
I think this is somehow related to running the stats-util script, but I
don't know how to solve it.
What I did:
1. I run the 'stats-util -s' command from the terminal and get the initial
message:
'Moving: 45459190 into core statistics-2021'
This means that stats-util is trying to move 45 459 190 individual Solr
statistics records to a new shard, statistics-2021.
2. Then I run `tail /opt/dspace/log/solr.log -f | grep
"org\.apache\.solr\.core\.SolrCore \@ \[statistics\] webapp\=\/solr
path\=\/select"` to monitor the progress:
I can see that stats-util selects statistics records from Solr in batches
of 10 000 records, and I can monitor the query time in the QTime
attribute. An example line from solr.log is shown below:
2023-01-11 10:53:07,617 INFO org.apache.solr.core.SolrCore @ [statistics]
webapp=/solr path=/select
params={csv.mv.separator=|&q=*:*&csv.escape=\&start=340000&fq=time:([2021\-01\-01T00\:00\:00Z+TO+2022\-01\-01T00\:00\:00Z]+NOT+2022\-01\-01T00\:00\:00Z)&rows=10000&wt=csv}
hits=45459190 status=0 *QTime=650*
As you can see, QTime is initially under a second, but as the script keeps
running it gradually rises and eventually reaches several minutes per
query.
3. When I get to approximately 8 000 000 processed records (as indicated by
the value of the 'start' parameter in the /select query), Nagios starts
reporting the socket timeout (as described above), and when I access the
user interface I get the 'SQLException: Cannot get a connection, pool error
Timeout waiting for idle object' (as described above).
4. I have to terminate the stats-util process and restart PostgreSQL and/or
Tomcat to resume normal operation of our DSpace installation.
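In case it helps anyone reproduce the observation from steps 2 and 3, here
is a minimal Python sketch of how the 'start' offset, 'hits' and 'QTime'
values could be pulled out of the /select lines in solr.log. It only
assumes that the log lines look like the example quoted above; the function
names (parse_select_line, progress, watch) are made up for illustration and
are not part of DSpace or Solr.

```python
import re

# Patterns for the fields of interest in a Solr /select log line.
# Assumption: the line format matches the solr.log example quoted above.
START_RE = re.compile(r"[?&{]start=(\d+)")
HITS_RE = re.compile(r"hits=(\d+)")
QTIME_RE = re.compile(r"QTime=(\d+)")


def parse_select_line(line):
    """Return (start, hits, qtime_ms) for a /select log line, else None."""
    if "path=/select" not in line:
        return None
    start = START_RE.search(line)
    hits = HITS_RE.search(line)
    qtime = QTIME_RE.search(line)
    if not (start and hits and qtime):
        return None
    return int(start.group(1)), int(hits.group(1)), int(qtime.group(1))


def progress(start, total_hits):
    """Fraction of the matching records already requested (0.0 to 1.0)."""
    return start / total_hits if total_hits else 0.0


def watch(stream):
    """Print one summary line per /select request read from `stream`."""
    for line in stream:
        parsed = parse_select_line(line)
        if parsed is None:
            continue
        start, hits, qtime = parsed
        print(f"start={start} ({progress(start, hits):.1%}) QTime={qtime} ms")
```

Calling watch(sys.stdin) while piping `tail -f` into the script would print
the offset next to the query time. With the numbers above, the failure at
start=8 000 000 corresponds to about 17.6 % of the 45 459 190 records.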
Our DSpace installation details:
DSpace version: 6.4
Java Runtime Environment Version: 1.8.0_352
Java Runtime Environment Vendor: OpenJDK 64-Bit Server VM
Operating System Version: CentOS 7 3.10.0-1160.62.1.el7.x86_64
Total memory available on server: 24 GB
Tomcat memory assigned: 3 GB
DSpace cmd tools memory assigned: 5 GB
Database: PostgreSQL 9.6
DB related configuration in local.cfg:
- db.maxconnections = 100
- db.maxwait = 5000
- db.maxidle = 30
I would appreciate any insights into this issue and any help in solving it.
Thank you,
with best regards,
Jakub Řihák
Central Library
Charles University