What Kate said, and also: you can check individual EAD files for the issues we 
identified during our migration, with somewhat better error reporting, by 
putting them through https://eadchecker.lib.harvard.edu/

--
Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of 
"Bowers, Kate A." <kate_bow...@harvard.edu>
Reply-To: Archivesspace Users Group 
<archivesspace_users_group@lyralists.lyrasis.org>
Date: Tuesday, January 29, 2019 at 2:21 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Import Data Errors

EAD has fewer constraints than ArchivesSpace.  Those error messages are not 
terribly helpful. At the very end of this article (before the end notes) is a 
list of the constraints in AS that will throw an error if your EAD does not 
comply:
https://journal.code4lib.org/articles/12239

I think the messages are coming from what we have as number 26 on the list 
“Absence of both unittitle and unitdate at a subordinate level causes import to 
fail”. Our solution was to grab data via a script from the parent <did>. You 
may have few enough to do this by hand.  So, for example, we had <did>s that 
contained only <head> or only <note> or only <unitid>. These were all valid EAD 
but not acceptable to ArchivesSpace.
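
A script along the lines described above might look like the following sketch. 
It is not the script Harvard used: it is a minimal Python illustration, using 
only the standard library, that walks the component tree and copies the nearest 
ancestor's <unittitle> into any <c> (or <c01> to <c12>) whose <did> has neither 
a title nor a date. The names fill_titles and SAMPLE are made up for this 
example.

```python
import re
import xml.etree.ElementTree as ET

COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])?$")  # <c> and <c01>..<c12>


def local(el):
    """Element name without any namespace prefix."""
    return el.tag.rsplit("}", 1)[-1]


def child(el, name):
    """First direct child with the given local name, or None."""
    return next((k for k in el if local(k) == name), None)


def text_of(el):
    """Whitespace-trimmed text content of an element ('' for None)."""
    return "" if el is None else "".join(el.itertext()).strip()


def fill_titles(node, inherited=None):
    """Give title-less, date-less components the nearest ancestor's title.

    Returns the number of components changed.
    """
    changed = 0
    if local(node) == "archdesc" or COMPONENT.match(local(node)):
        did = child(node, "did")
        if did is not None:
            title, date = child(did, "unittitle"), child(did, "unitdate")
            if text_of(title):
                inherited = text_of(title)
            elif not text_of(date) and inherited:
                if title is None:
                    # Reuse the namespace (if any) of the surrounding <did>.
                    title = ET.SubElement(did, did.tag[:-len("did")] + "unittitle")
                title.text = inherited
                changed += 1
    for kid in node:
        changed += fill_titles(kid, inherited)
    return changed


# Hypothetical sample: the first file-level <c> has only a <note> in its <did>.
SAMPLE = """<ead><archdesc level="collection">
<did><unittitle>Smith papers</unittitle></did>
<dsc><c level="series"><did><unittitle>Correspondence</unittitle></did>
<c level="file"><did><note><p>Fragile</p></note></did></c>
<c level="file"><did><unitdate>1901</unitdate></did></c>
</c></dsc></archdesc></ead>"""

root = ET.fromstring(SAMPLE)
print(fill_titles(root), "component(s) filled from a parent <did>")
```

Whether an inherited title is the right description is a judgment call for the 
archivist; a script like this only gets the file past the import check.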

Below is a more digestible (but still Harvard-centric, and definitely a working 
document) list of these issues and the practice needed going forward to ensure 
AS will ingest EAD.  Your error messages are coming from the third entry listed 
under <c>.

Element category

Practice going forward

Reason

Reason category

Schema

Use the canonical EAD schema, not the Harvard modified schema

AS expects data that conforms to the canonical schema.

Schema

<frontmatter>

Do not use frontmatter

This will be added by EAD export from AS

Former practice no longer needed

<language>

Always include

Required for saving a resource in AS. EAD lacking it will ingest, but that is 
annoying to users who later edit resources.



<descgrp>

Do not use descgrp

This will not load to AS

Practice not supported by AS EAD ingest

<arrangement>

Do not embed <arrangement> within <scopecontent>

Roundtripped EAD from AS will be invalid

Practice not supported by AS EAD export

<c>

Always include an @level attribute value on all <c>s. If using 
@level="otherlevel" always include an @otherlevel value.

<c> without a @level will ingest as "otherlevel" but lack @otherlevel value

New attribute data required
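
The @level rule above is easy to check mechanically before import. The 
following is a minimal sketch in Python (standard library only); the function 
name level_problems and the use of @id to label components are assumptions made 
for this example, not part of any AS tooling.

```python
import re
import xml.etree.ElementTree as ET

COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])?$")  # <c> and <c01>..<c12>


def level_problems(root):
    """List (component id, problem) pairs for missing level information."""
    problems = []
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments or processing instructions, if present
        if not COMPONENT.match(el.tag.rsplit("}", 1)[-1]):
            continue
        label = el.get("id", "(no id)")
        level = el.get("level")
        if not level:
            problems.append((label, "missing @level"))
        elif level == "otherlevel" and not el.get("otherlevel"):
            problems.append((label, "@level='otherlevel' without @otherlevel"))
    return problems


# Hypothetical fragment with one compliant and two non-compliant components.
doc = ET.fromstring(
    '<dsc><c id="a" level="file"/><c id="b"/>'
    '<c02 id="c" level="otherlevel"/></dsc>'
)
for label, problem in level_problems(doc):
    print(label, problem)
```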

<c>

All components should have <unittitle>; in cases where formerly archivists might 
have used only <unitdate>, the parent <c>'s <unittitle> is often a good choice

The component-centered display in ArchivesSpace makes any component lacking the 
context provided by <unittitle> text vague and cryptic, hampering recognition 
and interpretation of the component.

New content strongly recommended

<c>

All components must have either a <unitdate> or a <unittitle>

EAD lacking this data will not load to AS

New content required

<chronlist>

<chronlist> should stand alone, not be embedded within <bioghist>

Will load twice into AS

Practice not supported by AS EAD ingest

<container>

<container> must include type and label attributes, cannot describe multiple 
containers in one container element, and should not include type of container 
as part of content.

AS accommodates container numbers and types, but does not accommodate note-like 
container information. In addition, AS creates "top container" records based on 
EAD ingest. These are linked records. Placing a range of boxes, for example, in 
a single container element creates incorrect data about containers.

Data model in AS is different from EAD
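
The <container> rules above can be approximated with a few heuristics. This is 
a rough Python sketch, not a definitive test: the RANGE pattern and the 
TYPE_WORDS list are assumptions chosen for illustration, and a legitimate 
container number such as "12-3" would be flagged as a false positive.

```python
import re
import xml.etree.ElementTree as ET

# Assumed heuristics: two numbers joined by a range or list separator, and a
# short list of container-type words that should live in @type, not content.
RANGE = re.compile(r"\d+\s*(?:-|–|to|and|,|&)\s*\d+")
TYPE_WORDS = ("box", "folder", "volume", "reel", "carton")


def container_problems(root):
    """List (container text, problem) pairs for likely <container> issues."""
    problems = []
    for el in root.iter():
        if not isinstance(el.tag, str) or el.tag.rsplit("}", 1)[-1] != "container":
            continue
        text = "".join(el.itertext()).strip()
        for attr in ("type", "label"):
            if not el.get(attr):
                problems.append((text, f"missing @{attr}"))
        if RANGE.search(text):
            problems.append((text, "looks like more than one container"))
        if text.lower().startswith(TYPE_WORDS):
            problems.append((text, "container type repeated in content"))
    return problems


# Hypothetical <did> with one clean container and two problematic ones.
doc = ET.fromstring(
    '<did><container type="box" label="Box">12</container>'
    '<container type="box">3-5</container>'
    '<container type="folder" label="Folder">Folder 7</container></did>'
)
for text, problem in container_problems(doc):
    print(repr(text), problem)
```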

<controlaccess>

Do not encode <title> in <controlaccess>

Finding aid will load, but data will be lost. Use <subject> or other 
appropriate element

Practice not supported by AS EAD ingest

<corpname>

@role ???

 Disappears on ingest



<creation>

<creation> statement should include ingest information

Include ingest to AS in your creation statement, e.g. "Created in Oxygen on 
2016-11-18; ingested to AS on 2016-12-12"

New content required

<dao>

One Digital Object per Archival Object

Automated linking to objects in the DRS is based on the ref_id of the Archival 
Object, which is used as an owner supplied name in DRS.

New limit

<dao>

Supply a title for digital archival objects; use <unittitle> of parent <c>

The @xlink:title attribute is required by AS ingest

New attribute data required

<dao>

<daodesc>???

 Disappears on ingest?



<dao>

To achieve thumbnails, <daogrp> must be coded thus: ????

 ?



<extent>

Collection-level <physdesc><extent> is required

EAD lacking this data will not load to AS

New content required

<extent>

Do not encode mixed content within <physdesc>

Finding aid will ingest, but content will be lost. Specifically, if a 
<physdesc> has some child elements, any text that is not inside a child element 
will be left behind during ingest. An entirely plain-text <physdesc> is OK.

New limit

<extent>

<extent> must begin with a number

EAD with non-numerical extent will not load to AS

New limit
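
The last two <extent> limits above (mixed content in <physdesc>, and extents 
that do not begin with a number) can be spotted with a short script. This is a 
Python sketch under stated assumptions: "begins with a number" is tested here 
as "first character is a digit", which may be stricter or looser than what the 
AS importer actually accepts, and physdesc_problems is a made-up name.

```python
import re
import xml.etree.ElementTree as ET


def physdesc_problems(root):
    """List (element, text, problem) triples for <physdesc>/<extent> issues."""
    problems = []
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue
        name = el.tag.rsplit("}", 1)[-1]
        if name == "extent":
            text = "".join(el.itertext()).strip()
            if not re.match(r"\d", text):
                problems.append(("extent", text, "does not begin with a number"))
        elif name == "physdesc" and len(el):
            # Mixed content: text directly inside <physdesc>, outside children.
            loose = [el.text] + [kid.tail for kid in el]
            stray = " ".join(s.strip() for s in loose if s and s.strip())
            if stray:
                problems.append(("physdesc", stray, "text outside child elements"))
    return problems


# Hypothetical fragment: the second <physdesc> has both problems, the third is
# entirely plain text and therefore acceptable.
doc = ET.fromstring(
    "<did><physdesc><extent>3 boxes</extent></physdesc>"
    "<physdesc><extent>circa 40 items</extent> in poor condition</physdesc>"
    "<physdesc>One scrapbook</physdesc></did>"
)
for element, text, problem in physdesc_problems(doc):
    print(element, repr(text), problem)
```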

<extref>

<extref> should be used more sparingly; consider using it only if the @href 
values point to Harvard-managed resources

Link rot. This has nothing to do with AS, except that the migration made links 
noticeable and exposed the rot that was already there

New recommendation

<index>

Do not encode nested indexes

Import may succeed, but data will be lacking from AS

Practice not supported by AS EAD ingest

<index>

Instead of creating <index>es, add controlaccess terms to components

This allows search and retrieval of all components across the whole corpus 
rather than in one finding aid.

Better data model for discovery and retrieval

<indexentry>

Do not encode nested indexentries

Import may succeed, but data will be lacking from AS

Practice not supported by AS EAD ingest

<list>

Do not encode nested lists

Import may succeed, but data will be lacking from AS

Practice not supported by AS EAD ingest

<list>

Do not use <defitem> or <list type="deflist">

Import may succeed, but data will be lacking from AS

Practice not supported by AS EAD ingest

<name>

Avoid <name>

Import may succeed, but data will be lacking from AS; use a more specific 
<persname>, <corpname>, or <geogname> instead

Practice not supported by AS EAD ingest

<namegrp>

Do not encode namegrps

Import may succeed, but data will be lacking from AS

Practice not supported by AS EAD ingest

<note>

Do not use <note> anywhere; where legal, use <odd> preferably with <head>

Import may succeed, but <note> will be lacking from AS; <head> in <odd> 
provides a better label than "generic note"

New limit

<origination>

Do not encode origination as mixed content; all data must be within child 
elements

Import may succeed, but data will be lacking from AS. Any content not within 
the child elements <corpname>, <famname>, or <persname> will not go into AS; 
this does not stop ingest, but the data is lost.  Attribute values will also be 
lost.

Practice not supported by AS EAD ingest; new constraint

<persname>

@role ???

 Disappears on ingest?



<processinfo>

Finding aid must have a <processinfo> with <head>Aleph ID</head> and content 
containing the Aleph record number for the collection

Indexing of finding aids in Primo and connecting them with bibliographic 
records depends on this exact specification being carried out successfully.

New content required

<ref>

<ref>????

 Internal refs lost on ingest



<table>

Do not use <table>

Longstanding practice to be continued

Existing limit

<unitdate>

Always supply value for @normal attribute in <unitdate>

This had formerly been accomplished through OASIS loader

New attribute data required

<unitdate>

Supply certainty="approximate" value for dates if approximate

New attribute data required

<unitdate>

Do not use @startYear @endYear

These were Harvard-specific attributes and will get lost in AS ingest

New limit

<unitdate>

Do not nest <unitdate> within <unittitle>

These are un-nested during AS ingest. Starting with nested <unitdate>s in EAD 
will give archivists an unrealistic idea of what the description will convey 
when un-nested.

New limit

<unitdate>

Use separate <unitdate> elements for bulk and inclusive dates, and indicate 
these differences by setting the @type attribute accordingly

AS cannot ingest two dates from one <unitdate> tag.

New limit

<unitdate>

Indicate approximation in <unitdate>s by setting the attribute 
@certainty="approximate"

"Circa" or "approximate" written into the date expression is not 
machine-actionable

New recommendation

<unitdate>

If there are no dates, do not use <unitdate> at all

Older practice often resulted in the following, which cannot be ingested by AS: 
<unitdate>undated</unitdate>. Consider whether "undated" belongs as part of the 
title.

New limit
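
Several of the <unitdate> rules above lend themselves to an automated check. 
The sketch below is a Python illustration (standard library only) covering four 
of them: placeholder dates such as "undated", a missing @normal, approximation 
written into the text without @certainty="approximate", and <unitdate> nested 
in <unittitle>. The list of placeholder strings and the "circa" pattern are 
assumptions; local practice may need others.

```python
import re
import xml.etree.ElementTree as ET

PLACEHOLDERS = {"undated", "n.d.", "no date"}          # assumed list
APPROX = re.compile(r"(circa|ca\.?|c\.)\s", re.I)      # assumed pattern


def named(root, name):
    """Yield every element whose namespace-free name matches."""
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.rsplit("}", 1)[-1] == name:
            yield el


def unitdate_problems(root):
    """List (date text, problem) pairs for likely <unitdate> issues."""
    nested = {id(d) for t in named(root, "unittitle") for d in named(t, "unitdate")}
    problems = []
    for date in named(root, "unitdate"):
        text = "".join(date.itertext()).strip()
        if text.lower() in PLACEHOLDERS:
            problems.append((text, "placeholder for no date"))
            continue
        if not date.get("normal"):
            problems.append((text, "missing @normal"))
        if APPROX.match(text) and date.get("certainty") != "approximate":
            problems.append((text, "approximation not in @certainty"))
        if id(date) in nested:
            problems.append((text, "nested in <unittitle>"))
    return problems


# Hypothetical <did> exercising each rule once.
doc = ET.fromstring(
    '<c><did><unittitle>Letters, <unitdate normal="1901/1905">1901-1905'
    "</unitdate></unittitle><unitdate>circa 1910</unitdate>"
    "<unitdate>undated</unitdate></did></c>"
)
for text, problem in unitdate_problems(doc):
    print(repr(text), problem)
```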

<unitid>

Collection-level <unitid> is required

EAD lacking this data will not load to AS

New content required

<unitid>

Use only one <unitid>. If more than one <unitid> is needed, either place them 
in separate <c> elements or concatenate all into a single <unitid>

AS will ingest the finding aid, but content will be lost. All but one of the 
<unitid>s will be lacking.

New limit

<unittitle>

Collection-level <unittitle> is required

EAD lacking this data will not load to AS

New content required

<unittitle>

Use only one <unittitle>. If more than one <unittitle> is needed, either place 
them in separate <c> elements or concatenate all into a single <unittitle>

AS will ingest the finding aid, but content will be lost. All but one of the 
<unittitle>s will be lacking.

New limit

<extent>

All <extent> measurement types must come from same list used in AS; if 
non-canonical measurements are needed, consider a separate <physdesc>

Non-matches will have two results: calculations based on measurements will be 
inaccurate, and the AS extent drop-down will become cluttered

New limit

<bibliography>

Avoid <bibliography>?





<ptrgrp>

avoid <ptrgrp>???








From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
<archivesspace_users_group-boun...@lyralists.lyrasis.org> On Behalf Of Ryan 
Flahive
Sent: Tuesday, January 29, 2019 12:17 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] Import Data Errors

Morning Folks,

I’m new to this group as my ArchivesSpace server was just set up a couple days 
ago.

Due to circumstances too lengthy to describe here, I am manually building this 
database from exported EAD files from my Archivists’ Toolkit system. My first 
few imports were successful, but then a few failed due to these errors:

Date: one or more required (or enter a Title)
Title: must not be an empty string (or enter a Date)

These records have titles and dates. Can anyone shed some light on how I 
resolve this issue? Feel free to email me with suggestions!

Thanks!

Ryan S. Flahive
Archivist
INSTITUTE OF AMERICAN INDIAN ARTS
83 Avan Nu Po Road, Santa Fe, NM 87508
P. 505-424-2392
E. rflah...@iaia.edu
www.iaia.edu

IAIA's Mission: To empower creativity and leadership in Native arts and 
cultures through higher education, lifelong learning, and outreach.

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
