Re: [Archivesspace_Users_Group] PUI indexing issues

2021-03-16 Thread Blake Carver
I can only answer some of those.

- Staff indexing is done and has its files written. So does the number of 
threads given to that make a difference? Is it still taking up resources?

Not so much if it's not doing anything.

- Does there happen to be any way to stop the staff indexing and just let PUI 
have full access to the server for indexing?

You can disable either indexer, but that requires a restart. There's a setting 
in the config. The PUI is just slower than the Staff.
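
As a rough illustration of what that config change might look like, here is a fragment for config/config.rb. The setting names are an assumption recalled from ArchivesSpace's default configuration rather than anything stated in this thread, so check them against the config-defaults file for your version. This sketch covers only the PUI indexer and the indexer process as a whole, not a staff-only switch.

```ruby
# Fragment for config/config.rb. Changes only take effect after a restart.
# ASSUMPTION: both setting names come from memory of the shipped defaults,
# not from this thread. Verify them for your ArchivesSpace version.

# Turn off only the PUI indexer, leaving the staff indexer running.
AppConfig[:pui_indexer_enabled] = false

# Turn off the indexer process as a whole (left commented out here).
# AppConfig[:enable_indexer] = false
```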

- Should our repositories be broken up into smaller groupings?  I'm beginning 
to wonder if we have things set up incorrectly, since it sounds like we have a 
very large data set compared to others.

It's probably not the total number of resources in a repo, just that the 
resources are quite large.




Re: [Archivesspace_Users_Group] PUI indexing issues

2021-03-16 Thread Blake Carver
> I've now left my PUI indexing threads and count at the default (which I 
> believe is 1 thread and 25 records/thread).

Try dropping both indexer_records_per_thread and indexer_thread_count for both 
PUI and Staff indexers. Maybe in half or so. Sometimes with larger records it 
just needs to be slowed down.
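
A minimal sketch of that suggestion as a config/config.rb fragment follows. The two staff setting names are the ones given above. The `pui_`-prefixed names are an assumption about how the PUI equivalents are spelled, and the values are only illustrative "about half" figures based on the 25 records/thread default mentioned in the thread.

```ruby
# Fragment for config/config.rb. Restart ArchivesSpace after editing.

# Staff indexer: setting names as given in the thread.
AppConfig[:indexer_records_per_thread] = 12  # roughly half of 25
AppConfig[:indexer_thread_count] = 2         # illustrative; assumes the current value is higher

# PUI indexer: ASSUMED setting names (pui_ prefix), not quoted in the thread.
AppConfig[:pui_indexer_records_per_thread] = 12
AppConfig[:pui_indexer_thread_count] = 1     # cannot go lower if it is already 1
```

Fewer records per thread should mean smaller batches held in memory at once, which is the point of slowing the indexer down for very large records.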

From: archivesspace_users_group-boun...@lyralists.lyrasis.org on behalf of 
Tom Hanstra
Sent: Tuesday, March 16, 2021 12:51 PM
To: Archivesspace Users Group 
Subject: [Archivesspace_Users_Group] PUI indexing issues

Hello again.

I'm still trying to understand some indexing issues. I've now left my PUI 
indexing threads and count at the default (which I believe is 1 thread and 25 
records/thread). And I have given 4GB to Java processes. I've tried other 
values as well, but with similar results.

No matter what values I use, I cannot seem to fully index PUI. Each time, it 
will start well but continuously slow down. I've kept a spreadsheet of the 
number of records/hr I'm indexing and have several attempts which start in the 
50-60K/hr range and then steadily slow down to 1,500-1,800/hr before finally 
dying with a Java heap error. I think I'm headed to that again this round.

Why might this be happening?  Could my data have been corrupted during the 
transfer from Lyrasis? (I'm working with a database export of our production 
data.) Is the database too far away? (Our database is in AWS RDS, accessed 
from our AWS EC2 instance.)

I do have one log which gave this error:

E, [2021-03-12T18:14:53.886243 #2919] ERROR -- : Thread-9472: Failed fetching archival_object id=1484623: too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 3150, last used 1615590893.870297 seconds ago

prior to the Java Heap error. In that log, there were a number of connections 
for the staff indexer after the PUI indexer stopped reporting, then an 88 
minute gap prior to the above connection error and then finally a Java Heap 
error in the archivesspace.out log.
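
One detail in that log line may be worth decoding: the "last used ... seconds ago" figure is far too large to be a real idle time. The Ruby sketch below assumes the server clock was at UTC-5, which the thread does not state. Under that assumption the figure matches the Unix epoch time of the log entry itself, which would suggest the connection had never been used rather than that it sat idle.

```ruby
# Figure copied from the log line quoted above.
last_used_seconds = 1615590893.870297

# Read as seconds since the Unix epoch, the value is a date, not a duration.
as_epoch = Time.at(last_used_seconds).utc
puts as_epoch.strftime("%Y-%m-%d %H:%M:%S UTC")  # => 2021-03-12 23:14:53 UTC

# Timestamp of the log entry, with an ASSUMED server offset of UTC-5.
logged_at = Time.new(2021, 3, 12, 18, 14, 53, "-05:00") + 0.886243

# Difference between the log entry's own time and the reported figure.
gap = logged_at.to_f - last_used_seconds
puts format("gap between log time and reported value: %.3f s", gap)
```

If that reading is right, it fits the "after 0 requests" part of the message: the request timed out on a connection that had not yet served anything. It does not by itself explain the heap error.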

Does the indexer reauthenticate each time it connects to get more information?  
The earlier question about authentication has me wondering if my database 
server might be balking at the number of reconnections or something. I'm trying 
to index 760K records.

Bottom line is that I'm still not getting my PUI index creation to complete. 
Each run can take several days before it finally fails and I have to start all 
over again.  I'm looking for any help to track down why this slowdown is 
occurring and what I can do to address it.
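
To put the slowdown in perspective, the rates quoted above can be turned into rough completion times for 760K records. This is plain arithmetic on the figures in this message, assuming the rate held constant for a whole run, which it clearly does not.

```ruby
# Hours needed to index `total` records at a constant rate of `records_per_hour`.
def hours_to_index(total, records_per_hour)
  total.to_f / records_per_hour
end

total_records = 760_000

# Rates taken from the message: the starting range and the slowed-down range.
rates = {
  "start, low end"   => 50_000,
  "start, high end"  => 60_000,
  "slowed, high end" => 1_800,
  "slowed, low end"  => 1_500
}

rates.each do |label, rate|
  hours = hours_to_index(total_records, rate)
  puts format("%-17s %6d/hr  %7.1f hours  (%4.1f days)", label, rate, hours, hours / 24)
end
```

At the starting rate the full index would take roughly 13 to 15 hours. At the slowed rate it would take on the order of 18 to 21 days, which is consistent with runs lasting several days before failing.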

Thanks,
Tom
--
Tom Hanstra
Sr. Systems Administrator
hans...@nd.edu

___
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group


[Archivesspace_Users_Group] Webinar Announcement: Implementing the Protocols for Native American Archival Materials at Northern Arizona University

2021-03-16 Thread Jessica Crouch
Dear ArchivesSpace users,



ArchivesSpace will be offering a webinar on Implementing the Protocols for 
Native American Archival Materials at Northern Arizona University. In this 
webinar, Sam Meier, Cindy Summers, and Liz Garcia will discuss the ongoing 
efforts of Northern Arizona University’s Cline Library Special Collections and 
Archives to implement the Protocols for Native American Archival 
Materials; their vision for applying 
the Protocols to archival arrangement and description as well as rearrangement 
and redescription; logistical, technological, and resource issues with regards 
to updating legacy archival description in ArchivesSpace; and the design, 
implementation, and progress to date of the Archival Description Internship 
developed to help make this work possible.



When: April 14, 2021

Time: 2:00 p.m. – 3:00 p.m. ET (11:00 a.m. – 12:00 p.m. PT)

Where: Zoom

Registration: https://lyrasis.zoom.us/webinar/register/WN_THgc9aVTSDK4y0GygvSLeQ



This webinar will be recorded and made available on the ArchivesSpace YouTube 
channel.



Background information:


In April 2006, Northern Arizona University (NAU) in Flagstaff, Arizona hosted a 
gathering of Native American and non-Native American cultural heritage 
professionals who together drafted the Protocols for Native American Archival 
Materials, a professional best 
practices document which outlined guidelines for culturally responsive care of 
Native American archival materials held by non-tribal institutions. NAU’s Cline 
Library and Special Collections and Archives formally endorsed the Protocols in 
2006. Since then, the staff members of Cline Library Special Collections and 
Archives (SCA) have sought to integrate the guidance put forth by the Protocols 
into all aspects of their work, including collection development, collections 
management, and archival arrangement and description.



In 2019, SCA’s newly hired Archivist for Discovery, Sam(antha) Meier, began 
revising the department’s draft Arrangement and Description Policy to address 
issues common to academic archives and special collections, such as an 
extensive and growing backlog of unprocessed materials and outdated and 
inaccurate legacy description. Supported by colleagues at Arizona State 
University, she began to explore the possibility of using ArchivesSpace to more 
rapidly gain collection-level control over new acquisitions, update existing 
legacy finding aids, and transition the department’s EAD finding aids hosted in 
Arizona Archives Online from EAD 2002 to EAD3. During the COVID-19 pandemic, 
Meier collaborated with Library Assistant Manager Cindy Summers to begin a 
holistic review of SCA’s legacy finding aids to prepare for their eventual 
ingest into ArchivesSpace for revision and correction. Meier and Summers found 
ways to continue this critical work remotely, as neither was initially working 
in Cline Library.



Cognizant of the need to implement the Protocols at every step in archival 
processing, including re-description, re-arrangement, and re-processing, in the 
winter of 2020 Meier and Summers developed a fully remote paid graduate 
internship, the Archival Description Internship, intended to support an MLIS 
student at the University of Arizona in Tucson, Arizona. The internship was 
designed to function as a "pilot project" for the department, allowing Meier 
and Summers to explore how to apply the Protocols to their legacy finding aids 
along with intern Liz Garcia.



Presenter Information:



Elizabeth (Liz) Garcia is an MLIS student and Knowledge River Scholar at the 
University of Arizona’s School of Information with a concentration in archival 
studies. In addition to her work as the Archival Description Intern at Cline 
Library Special Collections and Archives, she is also the Graduate Assistant 
Archives Specialist at the University of Arizona Libraries Special Collections. 
Her research interests include archival theory, social justice in libraries and 
archives, and digital curation.

Cindy Summers is an Assistant Manager at Cline Library Special Collections and 
Archives where she supervises volunteers, student employees, and interns. Cindy 
works closely with the Archivist for Discovery (Samantha Meier) to oversee 
collections processing, which has led to the ambitious project of updating all 
of SCA’s legacy finding aids and ingesting them in ArchivesSpace. Despite her 
employment history in many different units of Cline Library over 23 years, she 
feels fortunate to have become a staff member of Special Collections and 
Archives where she is surrounded by fascinating history, inspired by incredible 
coworkers, and encouraged by the aspirations of future archivists.



Sam(antha) Meier is the Archivist for Discovery at Cline Library Special 
Collections and Archives, where she oversees arrangement and description of the 
department’s archival 

[Archivesspace_Users_Group] Nominations Requested for ArchivesSpace User Advisory Council

2021-03-16 Thread Gordon Daines
The ArchivesSpace Governance Board is seeking nominations to fill up to three 
(3) vacancies on the ArchivesSpace User Advisory Council (UAC).

The User Advisory Council 
(https://archivesspace.org/governance-board-and-councils#UAC) is a critical 
part of the ArchivesSpace community, serving as a communication conduit between 
ArchivesSpace governance groups and ArchivesSpace users. Some of the activities 
UAC is currently engaged in are:

  *   Advising the ArchivesSpace Governance Board and the ArchivesSpace 
Organizational Home on the design and delivery of services, such as community 
engagement and support services, technical support, and training.
  *   As a joint representative of the Development Prioritization subteam, 
discussing and voting on ideas for software enhancements and improvements.
  *   As a joint representative of the Testing subteam, conducting 
user-centered testing of the application prior to releases and conducting 
ongoing usability studies as needed.
  *   Liaising with national and regional archives organizations.
  *   Maintaining and updating user documentation.
  *   Advocating on the usability and functionality of the software program to 
facilitate adoption of the program.

Nominees should be presently employed at an ArchivesSpace member institution 
(please refer to the categorized member list at: 
http://archivesspace.org/community/whos-using-archivesspace/). As the community 
for a software application used in many different settings, we seek diverse 
voices in working to improve ArchivesSpace and support those who use or may 
benefit from using it. We especially encourage Black, Indigenous, and People of 
Color, and others currently less represented in our governance groups to 
consider joining us in these important leadership roles.

The anticipated time commitment for each appointee is expected to be two hours 
per week on average.

The term of service will be July 1, 2021-June 30, 2023. Each new appointee will 
be eligible to have her/his appointment renewed for an additional two-year 
term, i.e., July 1, 2023-June 30, 2025.

To nominate a candidate for the ArchivesSpace User Advisory Council, please 
submit this form at https://forms.gle/m2nN49M34CPgnvci9. The Nominating 
Committee will review all nominations and recommend appointments to the 
Governance Board for approval.

Nominations must be received by 9:00 p.m. EDT on Friday, April 23rd. Please 
contact Christine Di Bella, ArchivesSpace Program Manager, at 
christine.dibe...@lyrasis.org or any member of the Nominating Committee with 
questions.

Thank you for your participation in this important process, which is an 
essential part of our identity and operations as a community-based software 
organization.

Respectfully,

J. Gordon Daines III
Brigham Young University
Chair, and Governance Board member

On behalf of Nominating Committee members:

Brittany Newberry, Robert W. Woodruff Library, Atlanta University Center, Chair 
of User Advisory Council
Trevor Thornton, North Carolina State University, Chair of Technical Advisory 
Council
Zach Johnson, Vanderbilt University, representing very large membership level
Jacky Johnson, Miami University, representing large membership level
Heidi Buchanan, Western Carolina University, representing medium membership level
Patricia Burdick, Colby College, representing small membership level
Yue Ma, Museum of Chinese in America, representing very small membership level
Christine Di Bella (ArchivesSpace Program Manager), ex officio

___
J. Gordon Daines III
Curator of Research and Instruction Services
Curator of the Yellowstone National Park collection
1130 HBLL
Brigham Young University
Provo, UT 84602
801-422-5821
gordon_dai...@byu.edu



[Archivesspace_Users_Group] Nominations Requested for ArchivesSpace Technical Advisory Council

2021-03-16 Thread Gordon Daines
The ArchivesSpace Governance Board is seeking nominations to fill up to three 
(3) vacancies on the ArchivesSpace Technical Advisory Council (TAC).

The Technical Advisory Council 
(https://archivesspace.org/governance-board-and-councils#TAC) is a critical 
part of the ArchivesSpace community, having responsibility for providing 
technical guidance to individuals or organizations contributing to application 
development, to the User Advisory Council, and to the ArchivesSpace Governance 
Board. TAC's current activities include:

  *   Reviewing enhancements and priorities and testing in collaboration with 
the User Advisory Council.
  *   Providing support for migrating data to ArchivesSpace from other systems 
and support for importing and exporting data in formats such as EAD, MARCXML, 
and CSV.
  *   Documenting the metadata standards used by ArchivesSpace and monitoring 
the standards landscape.
  *   Identifying integration points for ArchivesSpace with other systems (e.g. 
digital asset management systems, patron and request management systems, etc.), 
creating resources to assist the community with integration work and, for 
specific integrations, developing technical requirements.
  *   Maintaining and updating technical documentation, including documentation 
of the API.
  *   Providing support and resources to help develop a community of code 
committers.

Nominees should preferably be presently employed at an ArchivesSpace member 
institution (please refer to the member list at: 
http://archivesspace.org/community/whos-using-archivesspace/) or employees of a 
current Registered Service Provider (see: 
http://archivesspace.org/registered-service-providers/current-rsps/). 
Nominations from non-member institutions will be considered based on their 
expertise and ability to contribute to the TAC. As the community for a software 
application used in many different settings, we seek diverse voices in working 
to improve ArchivesSpace and support those who use or may benefit from using 
it. We especially encourage Black, Indigenous, and People of Color, and others 
currently less represented in our governance groups to consider joining us in 
these important leadership roles.

The anticipated time commitment for each appointee is expected to be two hours 
per week on average.

The term of service will be July 1, 2021-June 30, 2023. Each new appointee will 
be eligible to have her/his appointment renewed for an additional two-year 
term, i.e., July 1, 2023-June 30, 2025.

To nominate a candidate for the ArchivesSpace Technical Advisory Council, 
please submit this form at https://forms.gle/m2nN49M34CPgnvci9. The Nominating 
Committee will review all nominations and recommend appointments to the 
Governance Board for approval.

Nominations must be received by 9:00 p.m. EDT on Friday, April 23rd. Please 
contact Christine Di Bella, ArchivesSpace Program Manager, at 
christine.dibe...@lyrasis.org or any member of the Nominating Committee with 
questions.

Thank you for your participation in this important process, which is an 
essential part of our identity and operations as a community-based software 
organization.

Respectfully,

J. Gordon Daines III
Brigham Young University
Chair, and Governance Board member

On behalf of Nominating Committee members:

Brittany Newberry, Robert W. Woodruff Library, Atlanta University Center, Chair 
of User Advisory Council
Trevor Thornton, North Carolina State University, Chair of Technical Advisory 
Council
Zach Johnson, Vanderbilt University, representing very large membership level
Jacky Johnson, Miami University, representing large membership level
Heidi Buchanan, Western Carolina University, representing medium membership level
Patricia Burdick, Colby College, representing small membership level
Yue Ma, Museum of Chinese in America, representing very small membership level
Christine Di Bella (ArchivesSpace Program Manager), ex officio

___
J. Gordon Daines III
Curator of Research and Instruction Services
Curator of the Yellowstone National Park collection
1130 HBLL
Brigham Young University
Provo, UT 84602
801-422-5821
gordon_dai...@byu.edu



[Archivesspace_Users_Group] Nominations Requested for ArchivesSpace Governance Board

2021-03-16 Thread Gordon Daines
As chair of the 2021 ArchivesSpace Nominating Committee, I seek your 
nominations of candidates to stand for election for the ArchivesSpace 
Governance Board representing the Small and Very Large membership levels. The 
Governance Board is charged with providing oversight for the ArchivesSpace 
application and community, including fiscal oversight, the approval of 
development priorities, and oversight of the Technical and User Advisory 
councils.

We are seeking two candidates for the Small level and one candidate for the 
Very Large level. (The current incumbent at the Very Large membership level is 
running for re-election as representative.) The terms for all successful 
candidates will be July 1, 2021-June 30, 2024. Each newly elected member 
representative will be eligible to run for a second term, July 1, 2024-June 30, 
2027.

Governance Board nominees must be currently employed by an ArchivesSpace member 
organization in either the Small or Very Large member levels (see the member 
lists at: http://archivesspace.org/community/whos-using-archivesspace/). A 
variety of skills, interests, and experience can be useful in furthering the 
Board's mission, but nominees should ideally have administrative and fiscal 
oversight experience. As the community for a software application used in many 
different settings, we seek diverse voices in working to improve ArchivesSpace 
and support those who use or may benefit from using it. We especially encourage 
Black, Indigenous, and People of Color, and others currently less represented 
in our governance groups to consider joining us in these important leadership 
roles.

To submit nominations, including self-nominations, please submit this form at 
https://forms.gle/m2nN49M34CPgnvci9. Nominees who confirm their interest in 
standing for election will be requested to supply a formal candidate statement 
outlining their qualifications for serving on the ArchivesSpace Governance 
Board (see: http://archivesspace.org/about/governance-board-and-councils/). The 
Nominating Committee will review all nominations and confirm a slate of 
nominees with the current Governance Board. An election ballot will be 
circulated to member institutions in mid-May.

Nominations must be received by 9:00 p.m. EDT on Friday, April 23rd. Please 
contact Christine Di Bella, ArchivesSpace Program Manager, at 
christine.dibe...@lyrasis.org or any member of the Nominating Committee with 
questions.

Thank you for your participation in this important process, which is an 
essential part of our identity and operations as a community-based software 
organization.

Respectfully,

J. Gordon Daines III
Brigham Young University
Chair, and Governance Board member

On behalf of Nominating Committee members:

Brittany Newberry, Robert W. Woodruff Library, Atlanta University Center, Chair 
of User Advisory Council
Trevor Thornton, North Carolina State University, Chair of Technical Advisory 
Council
Zach Johnson, Vanderbilt University, representing very large membership level
Jacky Johnson, Miami University, representing large membership level
Heidi Buchanan, Western Carolina University, representing medium membership level
Patricia Burdick, Colby College, representing small membership level
Yue Ma, Museum of Chinese in America, representing very small membership level
Christine Di Bella (ArchivesSpace Program Manager), ex officio


___
J. Gordon Daines III
Curator of Research and Instruction Services
Curator of the Yellowstone National Park collection
1130 HBLL
Brigham Young University
Provo, UT 84602
801-422-5821
gordon_dai...@byu.edu



[Archivesspace_Users_Group] FW: Introducing the LYRASIS DataCite US Community

2021-03-16 Thread Christine Di Bella
Forwarded on behalf of our colleague Sheila Rabun at LYRASIS. Please contact 
Sheila (sheila.ra...@lyrasis.org) if you have 
questions.

---

Greetings all,

We are pleased to announce the official launch of the LYRASIS DataCite US 
Community, a 
national consortium/community of practice for non-profit organizations in the 
US dedicated to creating and maintaining DOIs (Digital Object Identifiers) for 
special collections, unique cultural heritage materials, research data, and 
other scholarly content via the DataCite DOI 
registration agency. This new program provides membership cost sharing, 
dedicated support, and a shared pathway for organizations to support open 
research infrastructure by using DOIs to make scholarly content and research 
materials more FAIR: Findable, 
Accessible, Interoperable, and Reusable.

Please join us for a free informational webinar "Introducing the LYRASIS 
DataCite US Community" to learn more about the benefits of DOIs, participation, 
and program details:

  *   When: Thurs. April 22, 12-1pm Pacific/ 1-2pm Mountain/ 2-3pm Central/ 
3-4pm Eastern
  *   Registration and details: 
https://www.lyrasis.org/Content/Pages/Event-Details.aspx?Eid=E0361D0F-F281-EB11-80EF-00155D0A2721
  *   Note: All registrants will receive a link to the webinar recording.

This program is designed to serve any non-profit institution in the US that is 
not already a DataCite member, and your organization does not need to be a 
LYRASIS member to participate. Additional information can be found online at 
https://www.lyrasis.org/programs/Pages/DataCite-US-Community-Membership.aspx. 
Feel free to contact me directly with any questions: 
sheila.ra...@lyrasis.org

Many thanks,
Sheila


Sheila Rabun, MA, MLIS
https://orcid.org/-0002-1196-6279
Program Leader, Persistent Identifier Communities
Web: http://orcid-us.org
Twitter: https://twitter.com/USconsortium
Phone: 1-800-999-8558 x4809
Pronouns: she/her/hers
Recognize me: https://rescognito.com/-0002-1196-6279


I live, work, and learn on Kalapuya Ilihi, the traditional indigenous homeland 
of the Kalapuya people. Learn more about the Kalapuya people, their history, 
and the history of this land. Learn about land acknowledgments and ways to be 
in solidarity with Native nations beyond land acknowledgments.



Re: [Archivesspace_Users_Group] PUI indexing issues

2021-03-16 Thread Tom Hanstra
Thanks for the suggestion, Blake. A couple additional questions:

- Staff indexing is done and has its files written. So does the number of
threads given to that make a difference? Is it still taking up resources?

- Does there happen to be any way to stop the staff indexing and just let
PUI have full access to the server for indexing?

- What really bothers me is the slowdown. That indicates to me that some
resource is being lost along the way. Anyone have thoughts on what that
might be?

- Should our repositories be broken up into smaller groupings?  I'm
beginning to wonder if we have things set up incorrectly, since it sounds
like we have a very large data set compared to others.


And a comment

It is really frustrating to have to start over on the indexing each time.
It seems that there should be some way to document progress along the way
so that the indexing can pick up where it left off. Is that something that
might also be looked at?

Thanks all. Appreciate your help.

Tom



-- 
*Tom Hanstra*
*Sr. Systems Administrator*
hans...@nd.edu


[Archivesspace_Users_Group] PUI indexing issues

2021-03-16 Thread Tom Hanstra
Hello again.

I'm still trying to understand some indexing issues. I've now left my PUI
indexing threads and count at the default (which I believe is 1 thread and
25 records/thread). And I have given 4GB to Java processes. I've tried
other values as well, but with similar results.

No matter what values I use, I cannot seem to fully index PUI. Each time,
it will start well but continuously slow down. I've kept a spreadsheet of
the number of records/hr I'm indexing and have several attempts which start
in the 50-60K/hr range and then continuously slow down to the 1800-1500/hr
speed until finally dying with a Java Heap error. I think I'm headed to
that again this round.

Why might this be happening?  Could my data have been corrupted during the
transfer from Lyrasis? (I'm working with a database export of our
production data). Is the database too far away? (Our database is in AWS
RDS, accessed from our AWS EC2 instance.)

I do have one log which gave this error:

E, [2021-03-12T18:14:53.886243 #2919] ERROR -- : Thread-9472: Failed
fetching archival_object id=1484623: too many connection resets (due to
Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 3150, last used
1615590893.870297 seconds
ago

prior to the Java Heap error. In that log, there were a number of
connections for the staff indexer after the PUI indexer stopped reporting,
then an 88 minute gap prior to the above connection error and then finally
a Java Heap error in the archivesspace.out log.

Does the indexer reauthenticate each time it connects to get more
information?  The earlier question about authentication has me wondering if
my database server might be balking at the number of reconnections or
something. I'm trying to index 760K records.

Bottom line is that I'm still not getting my PUI index creation to
complete. Each run can take several days before it finally fails and I have
to start all over again.  I'm looking for any help to track down why this
slowdown is occurring and what I can do to address it.

Thanks,
Tom
-- 
*Tom Hanstra*
*Sr. Systems Administrator*
hans...@nd.edu


Re: [Archivesspace_Users_Group] Java error - Java::JavaSql::SQLException: HOUR_OF_DAY: 2 -> 3

2021-03-16 Thread Brian Hoffman
I created an issue for this in JIRA with references to the information shared 
by Megan and James.

https://archivesspace.atlassian.net/browse/ANW-1229

Brian

From:  on behalf of 
James Bullen 
Reply-To: Archivesspace Users Group 

Date: Monday, March 15, 2021 at 7:49 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] Java error - 
Java::JavaSql::SQLException: HOUR_OF_DAY: 2 -> 3


Many thanks Megan for the lead. I just poked around and found a few things out.

It turns out to be interesting in at least three bad ways.

1.  User records are updated every time they authenticate. This is unnecessary 
because it is just updating their authentication source, which usually never 
changes. So this line could be made conditional on the source being different:
https://github.com/archivesspace/archivesspace/blob/master/backend/app/model/authentication_manager.rb#L40

2.  The search_indexer user is being re-authenticated a lot. This shouldn’t 
happen because it should be a non-expiring session. I haven’t looked into this 
one, but I did check the lock_version on the search_indexer user on a 
large-ish AS db I have lying around, that has been in production for less than 
a year and it is over 100,000. That’s nuts. Actually I just took a quick look 
and saw this:
https://github.com/archivesspace/archivesspace/blob/master/indexer/app/lib/periodic_indexer.rb#L285
The session is reset if there is an error during an indexing round. So that 
could be the culprit.

3.  There is clearly a bug somewhere in the stack, probably in the Sequel 
layer, because it shouldn’t crash the system just because you have a whacky 
datetime somewhere. This bug may well have been fixed in a more recent version 
of that component.


Dealing with 1 will likely fix the problem, but 2 and 3 should also be 
investigated.
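
As a rough illustration of point 1 (this is not the ArchivesSpace code itself, which is Ruby), here is a minimal Python sketch of writing the user row only when the authentication source actually changes. The UserRecord class and both function names are hypothetical; lock_version stands in for the column of the same name mentioned in point 2.

```python
from dataclasses import dataclass


@dataclass
class UserRecord:
    """Hypothetical stand-in for a row in the user table."""
    username: str
    source: str
    lock_version: int = 0


def record_login_always(user: UserRecord, source: str) -> None:
    """Write on every login, even if nothing changed (the behavior described in point 1)."""
    user.source = source
    user.lock_version += 1  # every write bumps the version (and the mtime in a real database)


def record_login_if_changed(user: UserRecord, source: str) -> bool:
    """Write only when the source differs; return True if a write happened."""
    if user.source == source:
        return False
    user.source = source
    user.lock_version += 1
    return True


if __name__ == "__main__":
    noisy = UserRecord("search_indexer", "local")
    quiet = UserRecord("search_indexer", "local")
    for _ in range(1000):
        record_login_always(noisy, "local")
        record_login_if_changed(quiet, "local")
    # Compare how often each record was rewritten.
    print(noisy.lock_version, quiet.lock_version)
```

With the conditional version, a user whose source never changes is never rewritten, so no timestamp is written during the skipped DST hour on that account.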


Cheers,
James




On Mar 16, 2021, at 1:24 AM, Tom Hanstra <hans...@nd.edu> wrote:

Thanks, Megan. That update did fix things.

So, is this not something that can be addressed at the software fix level? Do 
other sites simply stop indexing or turn something else off when we hit DST in 
March?

Tom

On Mon, Mar 15, 2021 at 10:12 AM Schanz, Megan <schan...@msu.edu> wrote:
I run across this every March for daylight savings. This is what I have in my 
notes to do each time if the indexer is running. Luckily it seems like I didn't 
have this scenario this year.

java.lang.IllegalArgumentException: HOUR_OF_DAY: 2 -> 3:Java::JavaLang::IllegalArgumentException:
This error will appear in the logs when trying to start ArchivesSpace after 
daylight savings time. Reference: 
http://lyralists.lyrasis.org/mailman/htdig/archivesspace_users_group/2019-March/006652.html
Basically, the indexer user is being updated in the database with a time that 
does not exist due to daylight savings time.
To verify:

SELECT * FROM user WHERE (user_mtime >= '2021-03-14 02:00:00' AND user_mtime <= 
'2021-03-14 03:00:00') OR (system_mtime >= '2021-03-14 02:00:00' AND 
system_mtime <= '2021-03-14 03:00:00');
The record that should come back is the search_indexer user.
To resolve:

UPDATE user SET user_mtime = NOW(), system_mtime = NOW() WHERE 
username = 'search_indexer';
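
The date range in the verification query changes every year. A small Python sketch can compute the skipped hour for a given year. It assumes the server's timezone follows the current US rule (clocks jump from 02:00 to 03:00 on the second Sunday in March); the function names are made up for illustration, and other regions need a different rule.

```python
from datetime import date, datetime, timedelta
from typing import Tuple


def us_spring_forward_date(year: int) -> date:
    """Second Sunday in March: the US start of DST (assumed rule, in effect since 2007)."""
    march_first = date(year, 3, 1)
    # weekday(): Monday == 0 ... Sunday == 6
    days_until_sunday = (6 - march_first.weekday()) % 7
    first_sunday = march_first + timedelta(days=days_until_sunday)
    return first_sunday + timedelta(days=7)


def gap_bounds(year: int) -> Tuple[str, str]:
    """Start and end of the skipped hour, formatted for a SQL datetime comparison."""
    day = us_spring_forward_date(year).isoformat()
    return f"{day} 02:00:00", f"{day} 03:00:00"


def in_dst_gap(ts: datetime) -> bool:
    """True if a naive local timestamp falls inside the hour that never happened."""
    return ts.date() == us_spring_forward_date(ts.year) and ts.hour == 2


if __name__ == "__main__":
    print(gap_bounds(2021))
    print(in_dst_gap(datetime(2021, 3, 14, 2, 30)))
```

The two strings returned by gap_bounds can be dropped into the user_mtime and system_mtime comparisons in place of the hard-coded dates.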


From: archivesspace_users_group-boun...@lyralists.lyrasis.org on behalf of 
Tom Hanstra <hans...@nd.edu>
Sent: Monday, March 15, 2021 9:10 AM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Java error - 
Java::JavaSql::SQLException: HOUR_OF_DAY: 2 -> 3

Any suggestion on what ArchivesSpace might have changed?  I had the server 
running but indexing was complete. What might it have been changing and in what 
database table would I look for that change?

Alternately, since this is still test data, should I just overlay a backup copy 
of the database from earlier than the EDT cutover time?  I don't know what 
effect that might have on other portions of ArchivesSpace.

Finally, what is the overall fix for this issue? If others have seen it, what 
can be done to make sure to avoid it in the future?

Thanks,
Tom

On Mon, Mar 15, 2021 at 9:03 AM Blake Carver <blake.car...@lyrasis.org> wrote:
https://gist.github.com/Blake-/d493da28be5554a49a3a3835bbd98f05

You'll want to find the date more like '2021-03-14 02:00%' or would it be 
03-13? Whatever the date was this year.
Find any date with a time between 2-3am and just change it to any real hour.
ArchivesSpace did "something" (probably restarted?) at a bad time on Sunday 
morning and wrote a time that never happened.


Re: [Archivesspace_Users_Group] Past Perfect migration?

2021-03-16 Thread Solek, VivianLea
If you have questions about importing your data (not XML), I know you can 
export out of PastPerfect into a CSV/Excel file.  With version 2.8, you can 
import that directly into ASpace after you have migrated the data to the 
import template.  Thanks to the work of Bobbi Fox and others, what used to be a 
plug-in for importing Excel spreadsheets is part of the core code as of version 
2.8.

Good luck with your migration.

Best,
VivianLea

VivianLea Solek
Archivist
Knights of Columbus Supreme Council Archives

1 State Street
New Haven, CT 06511-6702
Phone 203 752-4578
Fax 203 865-0351

From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
On Behalf Of Celia Caust-Ellenbogen
Sent: Tuesday, March 16, 2021 12:01 PM
To: Archivesspace Users Group 
Subject: Re: [Archivesspace_Users_Group] Past Perfect migration?

I would recommend checking out the work that Lindsey Loeper did at UMBC to 
convert PastPerfect XML to EAD. Presumably you could then easily import the 
EAD into ASpace, but I haven't tried it myself. See 
https://www2.archivists.org/sites/all/files/Module_18_CaseStudy_LindseyLoeper.pdf 
and https://github.com/UMBC-Library/EAD-XML, among others.

On Wed, Mar 3, 2021 at 9:44 AM Novak, Miranda <mno...@csbsju.edu> wrote:
Hello,

My institution is brand-new to ArchivesSpace and has been using Past Perfect 
for 20+ years. I'm hoping to be directed to documentation to help me plan how 
to migrate data from Past Perfect into ArchivesSpace. Has anyone already done 
this and could you point me to the documents/videos that helped you?

Thank you and apologies if this isn't an appropriate question for this group!

Sincerely,
Miranda

Miranda Novak
Assistant Director of Instructional Technology
Clemens & Alcuin Libraries, IT Services
College of Saint Benedict/Saint John's University
320-363-5923
mno...@csbsju.edu
Pronouns: she/her/hers



--
Celia Caust-Ellenbogen (she/her/hers)
Friends Historical Library of Swarthmore College
ccaus...@swarthmore.edu
Please be aware that Swarthmore College is currently closed to visitors and all 
but essential staff due to coronavirus precautions. I am primarily working from 
home and spending only limited time in the office. Email is the best way to 
reach me.

Re: [Archivesspace_Users_Group] Past Perfect migration?

2021-03-16 Thread Celia Caust-Ellenbogen
I would recommend checking out the work that Lindsey Loeper did at UMBC to
convert PastPerfect XML to EAD. Presumably you could then easily import the
EAD into ASpace, but I haven't tried it myself. See
https://www2.archivists.org/sites/all/files/Module_18_CaseStudy_LindseyLoeper.pdf
and https://github.com/UMBC-Library/EAD-XML, among others.

On Wed, Mar 3, 2021 at 9:44 AM Novak, Miranda  wrote:

> Hello,
>
>
>
> My institution is brand-new to ArchivesSpace and has been using Past
> Perfect for 20+ years. I’m hoping to be directed to documentation to help
> me plan how to migrate data from Past Perfect into ArchivesSpace. Has
> anyone already done this and could you point me to the documents/videos
> that helped you?
>
>
>
> Thank you and apologies if this isn’t an appropriate question for this
> group!
>
>
>
> Sincerely,
>
> Miranda
>
>
>
> Miranda Novak
>
> Assistant Director of Instructional Technology
>
> Clemens & Alcuin Libraries, IT Services
>
> College of Saint Benedict/Saint John’s University
>
> 320-363-5923
>
> mno...@csbsju.edu
>
> Pronouns: she/her/hers
>
>
>


-- 
Celia Caust-Ellenbogen (she/her/hers)
Friends Historical Library of Swarthmore College

ccaus...@swarthmore.edu
Please be aware that Swarthmore College is currently closed to visitors and
all but essential staff due to coronavirus precautions. I am primarily
working from home and spending only limited time in the office. Email is
the best way to reach me.