Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-02-01 Thread Joshua D. Shaw
Upping the Solr timeout to 7200 seems to have prevented the commit timeout 
errors. I'll be interested to see if other institutions with large repos run 
into the same issue as more move to later versions of Solr. And whether the 
timeout time is a moving target as the number of objects in any one repo 
increases.

Current settings are stock except for the above timeout and I bumped the PUI 
thread count to 2. Solr memory at 4GB. Running on my laptop uncontainerized for 
testing. Will report back if any of that changes when I move to testing in 
containers and on our servers.

Potentially not related, but I did see a jetty idleTimeout error related to the 
notifications poll during the PUI indexer run. For some reason, that took 60s 
to respond vs the usual <10ms. Doing a bit of reading it seems like the jetty 
timeout was reduced from 5 minutes to 30 seconds in one of the jetty 9.x 
updates. AS's version of jetty was updated for 3.2.0 
(https://github.com/archivesspace/archivesspace/pull/2516) which may be why I 
have never seen this (running 3.1.1 currently)

A snippet of that in case anyone else has comments.

D, [2023-01-31T23:45:05.270418 #30794] DEBUG -- : Thread-10048: GET 
/notifications?last_sequence=1675201005 [session: nil]

...(Lots of PUI indexer requests which may​ have slowed the notifications 
thread response time?)...

D, [2023-01-31T23:46:05.277431 #30794] DEBUG -- : Thread-10048: Responded with 
[200, {"Content-Type"=>"application/json", "Cache-Control"=>"private, 
must-revalidate, max-age=0", "Content-Length"=>"3"}, ["[]\n"]]... in 60007ms
Jan 31, 2023 11:46:05 PM 
org.eclipse.jetty.server.handler.ContextHandler$Context log
WARNING: ERROR: couldn't handle exception (response is committed)
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout 
expired: 3/3 ms
...

This seems mosly harmless since the thread does respond eventually and only 
occurred once. Its a bit odd, though since everything is running locally.

Joshua

From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Monday, January 30, 2023 2:13 PM
To: archivesspace_users_group@lyralists.lyrasis.org 

Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Thanks, Andrew & Blake, et al!

One of my thoughts was to run the index against an earlier 8.x version of Solr 
to see if there are any significant changes in the indexer time.

Failing that, I think my only option is to increase the timeout to about 2hrs.

@Brian - I don't know if this is related to the DST issue since the indexer 
does not​ restart the index round for me - like it typically does in those DST 
circumstances.

To elaborate on a couple of specific thoughts Blake brought up.


  1.  We only have about 31.5K records in the deleted table, but as Andrew 
pointed out, I'm pretty sure those aren't really relevant here.
  2.  I may be reading the configs wrong, but it looks like 'debug' is the 
default level 
(https://github.com/archivesspace/archivesspace/blob/v3.3.1/common/config/config-defaults.rb#L53<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Farchivesspace%2Farchivesspace%2Fblob%2Fv3.3.1%2Fcommon%2Fconfig%2Fconfig-defaults.rb%23L53&data=05%7C01%7Cjoshua.d.shaw%40dartmouth.edu%7C2e9a3fe3042a451b654f08db02f61080%7C995b093648d640e5a31ebf689ec9446f%7C0%7C0%7C638107028086755563%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=g%2BNbBlr7BHP9qdrixTVlvJuRJv9ITvA2s3yyrtP76Vw%3D&reserved=0>)
  3.  We do have some very 'wide' collections with many series and many files 
under each series. Though these collections tend not to be very 'deep' - ie 
only 3-4 levels.

I'd be interested to hear if other institutions with similar collections also 
experience the same timeout issues. Thanks, Peter, for chiming in.

Will update if/when I find out more.

Joshua


From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Andrew 
Morrison 
Sent: Monday, January 30, 2023 4:46 AM
To: archivesspace_users_group@lyralists.lyrasis.org 

Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?


Truncating the "deleted_records" table will prevent the OAI-PMH service from 
being able to send out deletion notifications. Maybe it is worth trying on a 
testing system, but probably not a good idea on a production system.


Also note that is this...


> Deleted 186992 documents


... is not logging the deletion of deleted records. It is logging the deletion 
of Solr documents for unpublished records. But few, if any, exist to be 
deleted. ArchivesSpace sent 186992 IDs to Solr to delete, just in case any of 
them were unpublished

Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-30 Thread Joshua D. Shaw
Thanks, Andrew & Blake, et al!

One of my thoughts was to run the index against an earlier 8.x version of Solr 
to see if there are any significant changes in the indexer time.

Failing that, I think my only option is to increase the timeout to about 2hrs.

@Brian - I don't know if this is related to the DST issue since the indexer 
does not​ restart the index round for me - like it typically does in those DST 
circumstances.

To elaborate on a couple of specific thoughts Blake brought up.


  1.  We only have about 31.5K records in the deleted table, but as Andrew 
pointed out, I'm pretty sure those aren't really relevant here.
  2.  I may be reading the configs wrong, but it looks like 'debug' is the 
default level 
(https://github.com/archivesspace/archivesspace/blob/v3.3.1/common/config/config-defaults.rb#L53)
  3.  We do have some very 'wide' collections with many series and many files 
under each series. Though these collections tend not to be very 'deep' - ie 
only 3-4 levels.

I'd be interested to hear if other institutions with similar collections also 
experience the same timeout issues. Thanks, Peter, for chiming in.

Will update if/when I find out more.

Joshua


From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Andrew 
Morrison 
Sent: Monday, January 30, 2023 4:46 AM
To: archivesspace_users_group@lyralists.lyrasis.org 

Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?


Truncating the "deleted_records" table will prevent the OAI-PMH service from 
being able to send out deletion notifications. Maybe it is worth trying on a 
testing system, but probably not a good idea on a production system.


Also note that is this...


> Deleted 186992 documents


... is not logging the deletion of deleted records. It is logging the deletion 
of Solr documents for unpublished records. But few, if any, exist to be 
deleted. ArchivesSpace sent 186992 IDs to Solr to delete, just in case any of 
them were unpublished immediately before this index run. Solr returned a 200 OK 
response, even if none were found, so ArchivesSpace reports them as all deleted.


The other logged deletions, before that, are likewise not "real" 99.9% of the 
time. These batches...


> Deleted 100 documents


...are when ArchivesSpace tells Solr to delete the tree nodes for all archival 
objects without children, just in case any of them had children before this 
index run. Only the PUIIndexer does this, which is part of why it is slower (it 
is also allocated fewer threads in default config.) Again, 99.99% of the time 
there's nothing for Solr to delete, but it has to search its indexes for them 
anyway.


And these...


> Deleted 25 documents


...are the deletion of URIs in the deleted_records table. But most of those 
were deleted long ago by previous index runs. Again, it is just in case any new 
ones were recently deleted (and even those were probably deleted by the 
RealtimeIndexer.)


This "belt and braces" approach prevents a few stray records remaining in the 
PUI when they've been deleted or unpublished, but it seems to be the cause of 
the longest wait times for commits when re-indexing large repositories. Maybe 
something has changed in newer versions of Solr to make this process slower, 
possibly specifically for deletions?


Andrew.




On 27/01/2023 14:01, Blake Carver wrote:
> I'm running default config values for the AS log levels so they are all set 
> to 'debug'.

I'm seeing "INFO" and not "DEBUG" there.

> Deleted 186992 documents

How much is in the deleted_records table? Try truncating that. ArchivesSpace is 
going and deleting anything in that table.

> So I'm falling back to this just being super​ slow for some reason.

Could be some complex records, could be there's way too much in the deleted 
table.

> but I'm not sure why the PUI indexer would become so much slower (21 hours)

Yep, sounds about right. The PUI is slow.

> We do have some collections that are quite large (10s of thousands of AOs), 
> so maybe that's part of the issue.

No doubt that's slowing it down too.



From: 
archivesspace_users_group-boun...@lyralists.lyrasis.org<mailto:archivesspace_users_group-boun...@lyralists.lyrasis.org>
 
<mailto:archivesspace_users_group-boun...@lyralists.lyrasis.org>
 on behalf of Joshua D. Shaw 
<mailto:joshua.d.s...@dartmouth.edu>
Sent: Thursday, January 26, 2023 6:02 PM
To: Archivesspace Users Group 
<mailto:archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Thanks, Blake!

I'm running default config values for the AS log levels so they are all set to 
'debug'. I took a closer look, and the timeou

Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-30 Thread Andrew Morrison
Truncating the "deleted_records" table will prevent the OAI-PMH service 
from being able to send out deletion notifications. Maybe it is worth 
trying on a testing system, but probably not a good idea on a production 
system.



Also note that is this...


> Deleted 186992 documents


... is not logging the deletion of deleted records. It is logging the 
deletion of Solr documents for unpublished records. But few, if any, 
exist to be deleted. ArchivesSpace sent 186992 IDs to Solr to delete, 
just in case any of them were unpublished immediately before this index 
run. Solr returned a 200 OK response, even if none were found, so 
ArchivesSpace reports them as all deleted.



The other logged deletions, before that, are likewise not "real" 99.9% 
of the time. These batches...



> Deleted 100 documents


...are when ArchivesSpace tells Solr to delete the tree nodes for all 
archival objects without children, just in case any of them had children 
before this index run. Only the PUIIndexer does this, which is part of 
why it is slower (it is also allocated fewer threads in default config.) 
Again, 99.99% of the time there's nothing for Solr to delete, but it has 
to search its indexes for them anyway.



And these...


> Deleted 25 documents


...are the deletion of URIs in the deleted_records table. But most of 
those were deleted long ago by previous index runs. Again, it is just in 
case any new ones were recently deleted (and even those were probably 
deleted by the RealtimeIndexer.)



This "belt and braces" approach prevents a few stray records remaining 
in the PUI when they've been deleted or unpublished, but it seems to be 
the cause of the longest wait times for commits when re-indexing large 
repositories. Maybe something has changed in newer versions of Solr to 
make this process slower, possibly specifically for deletions?



Andrew.




On 27/01/2023 14:01, Blake Carver wrote:
> I'm running default config values for the AS log levels so they are 
all set to 'debug'.


I'm seeing "INFO" and not "DEBUG" there.

> Deleted 186992 documents

How much is in the deleted_records table? Try truncating that. 
ArchivesSpace is going and deleting anything in that table.


> So I'm falling back to this just being super​ slow for some reason.

Could be some complex records, could be there's way too much in the 
deleted table.


> but I'm not sure why the PUI indexer would become so much slower (21 
hours)


Yep, sounds about right. The PUI is slow.

> We do have some collections that are quite large (10s of thousands 
of AOs), so maybe that's part of the issue.


No doubt that's slowing it down too.



*From:* archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of 
Joshua D. Shaw 

*Sent:* Thursday, January 26, 2023 6:02 PM
*To:* Archivesspace Users Group 

*Subject:* Re: [Archivesspace_Users_Group] External Solr - Memory 
Allocation?

Thanks, Blake!

I'm running default config values for the AS log levels so they are 
all set to 'debug'. I took a closer look, and the timeout message 
happens exactly after the timeout amount I set (as you'd expect). 
Interestingly, Solr is in the middle of deleting documents when it 
goes silent


I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 
100 documents: #


 40 minutes pass with all the other AS log chatter ...

E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: 
SolrIndexerError when deleting records: Timeout error with  POST {}
I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 
100 documents: #


This continuing delete phase goes on for a bit until it stops logging 
batch deletes.


I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 
9 documents: #


 40 minutes pass with all the other AS log chatter ... And then 
the commit error pops up


E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: 
SolrIndexerError when committing:

Timeout error with  POST {"commit":{"softCommit":false}}.

Then after some more time

I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 
186992 documents: #


 This all seems to indicate to me that the commit phase is taking 
an inordinate amount of time (almost 2 hours - maybe that's what I 
need to set the timeout to?). After that, the indexer starts the 2nd repo


I, [2023-01-26T11:06:35.765797 #78764]  INFO -- : Thread-3384: PUI 
Indexer [2023-01-26 11:06:35 -0500] Indexed 2 additional PUI records 
in repository Sherman


 The indexer waits for a looong time with no timeout and no 
messaging - even though this is a tiny repo - and then starts the 3rd repo


I, [2023-01-26T11:31:32.795602 #78764]  INFO -- : Thread-3384: PUI 
Indexer [2023-01-26 11:31:32 -

Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-28 Thread Brian Hoffman
<< Why does one even have to run a periodic indexer? Aren't there guarantees in
the system that updates are seen through to the index in realtime, do bulk
updates not trigger a refresh of updated records selectively? Reading the code
seems to suggest that updates are queued until processed, does AS need a more
durable queue? >>

In theory, if your index is “up to date” (according to the indexer_state 
directory) the periodic indexer should have no work to do. I think this is part 
of a class of problems that arise when for some reason the periodic indexer 
cannot get through its workload and therefore tries and tries again. That is 
what happens, for example, when a MySQL database contains a record with a bad 
timestamp due to DST. If someone could file a JIRA issue with as much info as 
possible for recreating the problem (and maybe someone who could be contacted 
to supply a database copy) then it could probably be prioritized and addressed.

From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Peter 
Heiner 
Date: Saturday, January 28, 2023 at 3:51 AM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?
Joshua D. Shaw wrote on 2023-01-26 23:02:10:
> Thanks, Blake!
>
> I'm running default config values for the AS log levels so they are all set 
> to 'debug'. I took a closer look, and the timeout message happens exactly 
> after the timeout amount I set (as you'd expect). Interestingly, Solr is in 
> the middle of deleting documents when it goes silent
>
> I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 100 
> documents: #
>
>  40 minutes pass with all the other AS log chatter ...
>
> E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: 
> SolrIndexerError when deleting records: Timeout error with  POST {}
> I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 100 
> documents: #
>
> This continuing delete phase goes on for a bit until it stops logging batch 
> deletes.
>
> I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 9 
> documents: #
>
>  40 minutes pass with all the other AS log chatter ... And then the 
> commit error pops up
>
> E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: 
> SolrIndexerError when committing:
> Timeout error with  POST {"commit":{"softCommit":false}}.
>
> Then after some more time
>
> I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 186992 
> documents: #
>
>  This all seems to indicate to me that the commit phase is taking an 
> inordinate amount of time (almost 2 hours - maybe that's what I need to set 
> the timeout to?). After that, the indexer starts the 2nd repo

We're experiencing the exact same issue at least in the largest of our 30-odd
repositories:

I, [2023-01-28T07:24:48.015356 #2036632]  INFO -- : Thread-2006: Staff Indexer 
[2023-01-28 07:24:48 +] ~~~ Indexed 536300 of 587664 archival_object 
records in repository CUL
E, [2023-01-28T07:25:47.217953 #2036632] ERROR -- : Thread-2016: 
SolrIndexerError when committing:
Timeout error with  POST {"commit":{"softCommit":false}}.

We've had this problem from the start but were unable to dig deeper because
our in-house monitoring wasn't granular or even capable enough. Our crude
solution was using an external Solr since 2.5 and an external indexer since
around 2.7, and periodic restarts out of hours, but we've started getting
problems despite that. Our Solr is allocated 6GB of memory and timeout is set
to 1200 seconds, but the problem is that searches in AS fail during the wait
and that makes AS unusable, so we're reluctant to increase that.

Why does one even have to run a periodic indexer? Aren't there guarantees in
the system that updates are seen through to the index in realtime, do bulk
updates not trigger a refresh of updated records selectively? Reading the code
seems to suggest that updates are queued until processed, does AS need a more
durable queue?

Thanks,
p
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-28 Thread Peter Heiner
Joshua D. Shaw wrote on 2023-01-26 23:02:10:
> Thanks, Blake!
> 
> I'm running default config values for the AS log levels so they are all set 
> to 'debug'. I took a closer look, and the timeout message happens exactly 
> after the timeout amount I set (as you'd expect). Interestingly, Solr is in 
> the middle of deleting documents when it goes silent
> 
> I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 100 
> documents: #
> 
>  40 minutes pass with all the other AS log chatter ...
> 
> E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: 
> SolrIndexerError when deleting records: Timeout error with  POST {}
> I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 100 
> documents: #
> 
> This continuing delete phase goes on for a bit until it stops logging batch 
> deletes.
> 
> I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 9 
> documents: #
> 
>  40 minutes pass with all the other AS log chatter ... And then the 
> commit error pops up
> 
> E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: 
> SolrIndexerError when committing:
> Timeout error with  POST {"commit":{"softCommit":false}}.
> 
> Then after some more time
> 
> I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 186992 
> documents: #
> 
>  This all seems to indicate to me that the commit phase is taking an 
> inordinate amount of time (almost 2 hours - maybe that's what I need to set 
> the timeout to?). After that, the indexer starts the 2nd repo

We're experiencing the exact same issue at least in the largest of our 30-odd
repositories:

I, [2023-01-28T07:24:48.015356 #2036632]  INFO -- : Thread-2006: Staff Indexer 
[2023-01-28 07:24:48 +] ~~~ Indexed 536300 of 587664 archival_object 
records in repository CUL
E, [2023-01-28T07:25:47.217953 #2036632] ERROR -- : Thread-2016: 
SolrIndexerError when committing: 
Timeout error with  POST {"commit":{"softCommit":false}}.

We've had this problem from the start but were unable to dig deeper because
our in-house monitoring wasn't granular or even capable enough. Our crude
solution was using an external Solr since 2.5 and an external indexer since
around 2.7, and periodic restarts out of hours, but we've started getting
problems despite that. Our Solr is allocated 6GB of memory and timeout is set
to 1200 seconds, but the problem is that searches in AS fail during the wait
and that makes AS unusable, so we're reluctant to increase that.

Why does one even have to run a periodic indexer? Aren't there guarantees in
the system that updates are seen through to the index in realtime, do bulk
updates not trigger a refresh of updated records selectively? Reading the code
seems to suggest that updates are queued until processed, does AS need a more
durable queue?

Thanks,
p
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-27 Thread Blake Carver
> I'm running default config values for the AS log levels so they are all set 
> to 'debug'.

I'm seeing "INFO" and not "DEBUG" there.

> Deleted 186992 documents

How much is in the deleted_records table? Try truncating that. ArchivesSpace is 
going and deleting anything in that table.

> So I'm falling back to this just being super​ slow for some reason.

Could be some complex records, could be there's way too much in the deleted 
table.

> but I'm not sure why the PUI indexer would become so much slower (21 hours)

Yep, sounds about right. The PUI is slow.

> We do have some collections that are quite large (10s of thousands of AOs), 
> so maybe that's part of the issue.

No doubt that's slowing it down too.



From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Thursday, January 26, 2023 6:02 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Thanks, Blake!

I'm running default config values for the AS log levels so they are all set to 
'debug'. I took a closer look, and the timeout message happens exactly after 
the timeout amount I set (as you'd expect). Interestingly, Solr is in the 
middle of deleting documents when it goes silent

I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 100 
documents: #

 40 minutes pass with all the other AS log chatter ...

E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: SolrIndexerError 
when deleting records: Timeout error with  POST {}
I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 100 
documents: #

This continuing delete phase goes on for a bit until it stops logging batch 
deletes.

I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 9 
documents: #

 40 minutes pass with all the other AS log chatter ... And then the commit 
error pops up

E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: SolrIndexerError 
when committing:
Timeout error with  POST {"commit":{"softCommit":false}}.

Then after some more time

I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 186992 
documents: #

 This all seems to indicate to me that the commit phase is taking an 
inordinate amount of time (almost 2 hours - maybe that's what I need to set the 
timeout to?). After that, the indexer starts the 2nd repo

I, [2023-01-26T11:06:35.765797 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:06:35 -0500] Indexed 2 additional PUI records in repository 
Sherman

 The indexer waits for a looong time with no timeout and no messaging - 
even though this is a tiny repo - and then starts the 3rd repo

I, [2023-01-26T11:31:32.795602 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:31:32 -0500] Indexed 188 additional PUI records in repository 
Rauner-XO

And then the indexer starts the 4th repo soon after and seems to go on to 
complete normally

I, [2023-01-26T11:31:33.369369 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:31:33 -0500] ~~~ Indexed 25 of 74785 archival_object records in 
repository thedartmouth

The Solr logs indicate that Solr is working this entire time doing adds and 
deletes. For example in one of the quiet phases:

2023-01-26 10:23:35.928 INFO  (qtp2101153819-523) [   x:archivesspace] 
o.a.s.u.p.LogUpdateProcessorFactory [archivesspace]  webapp=/solr path=/update 
params={}{add=...
2023-01-26 10:23:38.195 INFO  (qtp2101153819-468) [   x:archivesspace] 
o.a.s.u.p.LogUpdateProcessorFactory [archivesspace]  webapp=/solr path=/update 
params={}{deleteByQuery=...

So I'm falling back to this just being super​ slow for some reason. I do have 
some custom indexer addons, but I'm not sure why the PUI indexer would become 
so much slower (21 hours) when the Staff indexer completes in a normal amount 
of time (a little under 6 hours). For previous versions this hasn't been quite 
that different (6hrs vs about 13hrs). We do have some collections that are 
quite large (10s of thousands of AOs), so maybe that's part of the issue.

I haven't checked to see if the PUI indexer is gathering that much more data 
(and traversing the tree more times - maybe?) than it was in 3.1.1, but that's 
on my 'to check' list.

Joshua

From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Blake 
Carver 
Sent: Thursday, January 26, 2023 4:12 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

That's... interesting.

That RAM allocation seems fine. That Solr timeout is way higher than I would 
think is needed.

Maybe set the loglevel to debug and see if it spits out something more useful? 
Maybe you'll be able to see wh

Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-26 Thread Joshua D. Shaw
Thanks, Blake!

I'm running default config values for the AS log levels so they are all set to 
'debug'. I took a closer look, and the timeout message happens exactly after 
the timeout amount I set (as you'd expect). Interestingly, Solr is in the 
middle of deleting documents when it goes silent

I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 100 
documents: #

 40 minutes pass with all the other AS log chatter ...

E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: SolrIndexerError 
when deleting records: Timeout error with  POST {}
I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 100 
documents: #

This continuing delete phase goes on for a bit until it stops logging batch 
deletes.

I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 9 
documents: #

 40 minutes pass with all the other AS log chatter ... And then the commit 
error pops up

E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: SolrIndexerError 
when committing:
Timeout error with  POST {"commit":{"softCommit":false}}.

Then after some more time

I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 186992 
documents: #

 This all seems to indicate to me that the commit phase is taking an 
inordinate amount of time (almost 2 hours - maybe that's what I need to set the 
timeout to?). After that, the indexer starts the 2nd repo

I, [2023-01-26T11:06:35.765797 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:06:35 -0500] Indexed 2 additional PUI records in repository 
Sherman

 The indexer waits for a looong time with no timeout and no messaging - 
even though this is a tiny repo - and then starts the 3rd repo

I, [2023-01-26T11:31:32.795602 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:31:32 -0500] Indexed 188 additional PUI records in repository 
Rauner-XO

And then the indexer starts the 4th repo soon after and seems to go on to 
complete normally

I, [2023-01-26T11:31:33.369369 #78764]  INFO -- : Thread-3384: PUI Indexer 
[2023-01-26 11:31:33 -0500] ~~~ Indexed 25 of 74785 archival_object records in 
repository thedartmouth

The Solr logs indicate that Solr is working this entire time doing adds and 
deletes. For example in one of the quiet phases:

2023-01-26 10:23:35.928 INFO  (qtp2101153819-523) [   x:archivesspace] 
o.a.s.u.p.LogUpdateProcessorFactory [archivesspace]  webapp=/solr path=/update 
params={}{add=...
2023-01-26 10:23:38.195 INFO  (qtp2101153819-468) [   x:archivesspace] 
o.a.s.u.p.LogUpdateProcessorFactory [archivesspace]  webapp=/solr path=/update 
params={}{deleteByQuery=...

So I'm falling back to this just being super​ slow for some reason. I do have 
some custom indexer addons, but I'm not sure why the PUI indexer would become 
so much slower (21 hours) when the Staff indexer completes in a normal amount 
of time (a little under 6 hours). For previous versions this hasn't been quite 
that different (6hrs vs about 13hrs). We do have some collections that are 
quite large (10s of thousands of AOs), so maybe that's part of the issue.

I haven't checked to see if the PUI indexer is gathering that much more data 
(and traversing the tree more times - maybe?) than it was in 3.1.1, but that's 
on my 'to check' list.

Joshua

From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Blake 
Carver 
Sent: Thursday, January 26, 2023 4:12 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

That's... interesting.

That RAM allocation seems fine. That Solr timeout is way higher than I would 
think is needed.

Maybe set the loglevel to debug and see if it spits out something more useful? 
Maybe you'll be able to see what it's up to during that long time. I like 
your theory on that.



From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Thursday, January 26, 2023 3:38 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Following up on this. And looking for some advice!

Even with the Solr timeout set to 40 minutes, I'm seeing some random Solr 
Timeout errors, though these do NOT cause the indexer to restart. In the latest 
test run I see one Solr Timeout for delete and one for commit - both following 
the PUI indexer run for AOs for the first and largest repo (~630k AOs).

The indexer throws the delete timeout error, waits for a lng time with 
seemingly no activity, throws the commit timeout error, waits again, and then 
picks back up as if nothing had gone wrong and continues with the initial index 
run. All of the index data looks correct (ie correct number of objects in both 
the staff and PUI).

My theory is that the Sol

Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-26 Thread Blake Carver
That's... interesting.

That RAM allocation seems fine. That Solr timeout is way higher than I would 
think is needed.

Maybe set the loglevel to debug and see if it spits out something more useful? 
Maybe you'll be able to see what it's up to during that long time. I like 
your theory on that.



From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Thursday, January 26, 2023 3:38 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Following up on this. And looking for some advice!

Even with the Solr timeout set to 40 minutes, I'm seeing some random Solr 
Timeout errors, though these do NOT cause the indexer to restart. In the latest 
test run I see one Solr Timeout for delete and one for commit - both following 
the PUI indexer run for AOs for the first and largest repo (~630k AOs).

The indexer throws the delete timeout error, waits for a lng time with 
seemingly no activity, throws the commit timeout error, waits again, and then 
picks back up as if nothing had gone wrong and continues with the initial index 
run. All of the index data looks correct (ie correct number of objects in both 
the staff and PUI).

My theory is that the Solr update phase really is taking a super lng time, 
but that the data has all been sent to Solr so the timeouts are really just 
ArchivesSpace waiting for Solr in between indexing one object type and the next 
and no index data is lost.

There are no corresponding log entries in the Solr logs that I can find.

I'm running solr 8.11.6 with 4GB and AS 3.3.1 with 4GB. Both bare metal on my 
laptop, so no container issues that might be at play. Solr memory use peaks at 
around 3.5GB.

I've kept the stock thread and records per thread settings and just upped the 
timeout (to 2400). I guess the next step is to set the timeout even higher - 
maybe an hour (3600)? I don't see a reason to run a lower thread or record 
count, but can certainly try that as well, though I'm not looking forward to 
the time it will take (the current run takes 21 hours as it is - up from about 
15 for 3.1.1)

Any advice appreciated!

Thanks!
Joshua


From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Tuesday, January 24, 2023 6:56 AM
To: Archivesspace Users Group 
Subject: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Hey all

We're about to jump to v3.3.1 and I'm wondering if anyone has any suggestions 
for memory allocation for Solr?

Currently we're running 6GB for the entire suite in v3.1.1 and are looking to 
keep the same overall memory footprint. Wondering if something like a 75/25 
split (ie 4GB for AS and 2GB for Solr) would be a reasonable allocation? Or are 
people finding that Solr is more demanding?

Thanks!
Joshua

___
Joshua Shaw (he, him)
Library Web & Application Developer
Digital Library Technologies Group
Dartmouth College
603.646.0405
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-26 Thread Joshua D. Shaw
Following up on this. And looking for some advice!

Even with the Solr timeout set to 40 minutes, I'm seeing some random Solr 
Timeout errors, though these do NOT cause the indexer to restart. In the latest 
test run I see one Solr Timeout for delete and one for commit - both following 
the PUI indexer run for AOs for the first and largest repo (~630k AOs).

The indexer throws the delete timeout error, waits for a lng time with 
seemingly no activity, throws the commit timeout error, waits again, and then 
picks back up as if nothing had gone wrong and continues with the initial index 
run. All of the index data looks correct (ie correct number of objects in both 
the staff and PUI).

My theory is that the Solr update phase really is taking a super lng time, 
but that the data has all been sent to Solr so the timeouts are really just 
ArchivesSpace waiting for Solr in between indexing one object type and the next 
and no index data is lost.

There are no corresponding log entries in the Solr logs that I can find.

I'm running solr 8.11.6 with 4GB and AS 3.3.1 with 4GB. Both bare metal on my 
laptop, so no container issues that might be at play. Solr memory use peaks at 
around 3.5GB.

I've kept the stock thread and records per thread settings and just upped the 
timeout (to 2400). I guess the next step is to set the timeout even higher - 
maybe an hour (3600)? I don't see a reason to run a lower thread or record 
count, but can certainly try that as well, though I'm not looking forward to 
the time it will take (the current run takes 21 hours as it is - up from about 
15 for 3.1.1)

Any advice appreciated!

Thanks!
Joshua


From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
 on behalf of Joshua 
D. Shaw 
Sent: Tuesday, January 24, 2023 6:56 AM
To: Archivesspace Users Group 
Subject: [Archivesspace_Users_Group] External Solr - Memory Allocation?

Hey all

We're about to jump to v3.3.1 and I'm wondering if anyone has any suggestions 
for memory allocation for Solr?

Currently we're running 6GB for the entire suite in v3.1.1 and are looking to 
keep the same overall memory footprint. Wondering if something like a 75/25 
split (ie 4GB for AS and 2GB for Solr) would be a reasonable allocation? Or are 
people finding that Solr is more demanding?

Thanks!
Joshua

___
Joshua Shaw (he, him)
Library Web & Application Developer
Digital Library Technologies Group
Dartmouth College
603.646.0405
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


[Archivesspace_Users_Group] External Solr - Memory Allocation?

2023-01-24 Thread Joshua D. Shaw
Hey all

We're about to jump to v3.3.1 and I'm wondering if anyone has any suggestions 
for memory allocation for Solr?

Currently we're running 6GB for the entire suite in v3.1.1 and are looking to 
keep the same overall memory footprint. Wondering if something like a 75/25 
split (ie 4GB for AS and 2GB for Solr) would be a reasonable allocation? Or are 
people finding that Solr is more demanding?

Thanks!
Joshua

___
Joshua Shaw (he, him)
Library Web & Application Developer
Digital Library Technologies Group
Dartmouth College
603.646.0405
___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group