What Kate said, and also: you can check individual EAD files against everything we figured out during our migration, with somewhat better error reporting, by putting them through https://eadchecker.lib.harvard.edu/
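For a quick local check of the one constraint behind the errors in this thread (a component whose <did> has neither <unittitle> nor <unitdate>), something like the following may help. It is only a sketch and is not how eadchecker or ArchivesSpace work internally. The function names and the sample EAD are made up for illustration, and it assumes that a title or date containing only whitespace counts as missing.

```python
import re
import xml.etree.ElementTree as ET

# Matches <c> and the numbered components <c01> .. <c12>.
COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])?$")


def local(tag):
    """Strip an XML namespace, so DTD-based and schema-based EAD both work."""
    return tag.rsplit("}", 1)[-1]


def find_components_missing_title_and_date(root):
    """Return (tag, id) for each component whose <did> has no usable
    <unittitle> and no usable <unitdate>.

    Assumption: an element with only whitespace text counts as missing.
    """
    problems = []
    for elem in root.iter():
        if not COMPONENT.match(local(elem.tag)):
            continue
        # Only look at the component's own <did>, not those of its children.
        did = next((ch for ch in elem if local(ch.tag) == "did"), None)
        found = did is not None and any(
            local(d.tag) in ("unittitle", "unitdate")
            and "".join(d.itertext()).strip()
            for d in did.iter()
        )
        if not found:
            problems.append((local(elem.tag), elem.get("id")))
    return problems


# Hypothetical sample: component "f1" has only a <container> in its <did>.
SAMPLE = """<ead xmlns="urn:isbn:1-931666-22-9">
<archdesc level="collection"><did><unittitle>Papers</unittitle></did><dsc>
<c01 level="series" id="s1"><did><unittitle>Letters</unittitle></did>
<c02 level="file" id="f1"><did><container type="box" label="Box">1</container></did></c02>
<c02 level="file" id="f2"><did><unitdate>1901</unitdate></did></c02>
</c01></dsc></archdesc></ead>"""

for tag, ident in find_components_missing_title_and_date(ET.fromstring(SAMPLE)):
    print(f"<{tag} id={ident!r}> has neither <unittitle> nor <unitdate>")
```

To run it against a real file, replace ET.fromstring(SAMPLE) with ET.parse("your-file.xml").getroot().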
-- 
Dave Mayo (he/him)
Senior Digital Library Software Engineer
Harvard University > HUIT > LTS

From: <archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of "Bowers, Kate A." <kate_bow...@harvard.edu>
Reply-To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Date: Tuesday, January 29, 2019 at 2:21 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Import Data Errors

EAD has fewer constraints than ArchivesSpace. Those error messages are not terribly helpful. At the very end of this article (before the end notes) is a list of the constraints in AS that will throw an error if your EAD does not comply: https://journal.code4lib.org/articles/12239

I think the messages are coming from what we have as number 26 on the list: “Absence of both unittitle and unitdate at a subordinate level causes import to fail”. Our solution was to grab data via a script from the parent <did>. You may have few enough to do this by hand. So, for example, we had <did>s that contained only <head> or only <note> or only <unitid>. These were all valid EAD but not acceptable to ArchivesSpace.

Below is a somewhat more digestible (but still Harvard-centric and definitely a working document) list of these issues and the practice needed going forward to ensure AS will ingest EAD. Your error messages are coming from the 3rd thing listed under <c>.

Element category | Practice going forward | Reason | Reason category
Schema | Use the canonical EAD schema, not the Harvard modified schema | AS expects data that conforms to the canonical schema. | Schema
<frontmatter> | Do not use frontmatter | This will be added by EAD export from AS | Former practice no longer needed
<language> | Always include | Required for saving in AS; EAD without it will ingest, but this is annoying to users who edit resources. |
<descgrp> | Do not use descgrp | This will not load to AS | Practice not supported by AS EAD ingest
<arrangement> | Do not embed <arrangement> within <scopecontent> | Roundtripped EAD from AS will be invalid | Practice not supported by AS EAD export
<c> | Always include an @level attribute value on all <c>s. If using @level="otherlevel" always include an @otherlevel value. | <c> without a @level will ingest as "otherlevel" but lack @otherlevel value | New attribute data required
<c> | All components should have <unittitle>; in cases where formerly archivists might have used only <unitdate>, the parent <c>'s unittitle is often a good choice | The component-centered display in ArchivesSpace makes any component lacking the context provided by <unittitle> text vague and cryptic, hampering recognition and interpretation of the component. | New content strongly recommended
<c> | All components must have either a <unitdate> or a <unittitle> | EAD lacking this data will not load to AS | New content required
<chronlist> | <chronlist> should stand alone, not be embedded within <bioghist> | Will load twice into AS | Practice not supported by AS EAD ingest
<container> | <container> must include type and label attributes, cannot describe multiple containers in one container element, and should not include type of container as part of content. | AS accommodates container numbers and types, but does not accommodate note-like container information. In addition, AS creates "top container" records based on EAD ingest. These are linked records. Placing a range of boxes, for example, in a single container element creates incorrect data about containers. | Data model in AS is different from EAD
<controlaccess> | Do not encode <title> in <controlaccess> | Finding aid will load, but data will be lost. Use <subject> or other appropriate element | Practice not supported by AS EAD ingest
<corpname> | @role ??? | Disappears on ingest |
<creation> | <creation> statement should include ingest information | Include ingest to AS in your creation statement, e.g. "Created in Oxygen on 2016-11-18; ingested to AS on 2016-12-12" | New content required
<dao> | One Digital Object per Archival Object | Automated linking to objects in the DRS is based on the ref_id of the Archival Object, which is used as an owner supplied name in DRS. | New limit
<dao> | Supply a title for digital archival objects; use <unittitle> of parent <c> | The xlink:title attribute is required by AS ingest | New attribute data required
<dao> | <daodesc>??? | Disappears on ingest? |
<dao> | To achieve thumbnails, <daogrp> must be coded thus: ???? | ? |
<extent> | Collection-level <physdesc><extent> is required | EAD lacking this data will not load to AS | New content required
<extent> | Do not encode mixed content within <physdesc> | Finding aid will ingest, but content will be lost. Specifically, if a <physdesc> has some child elements, any text that is not inside a child element will be left behind during ingest. An entirely plain-text <physdesc> is OK. | New limit
<extent> | <extent> must begin with a number | EAD with non-numerical extent will not load to AS | New limit
<extref> | <extref> should be used more sparingly; consider using it only if @href values point to Harvard-managed links | Link rot (has nothing to do with AS), except that during migration, links became noticeable and rot was there | New recommendation
<index> | Do not encode nested indexes | Import may succeed, but data will be lacking from AS | Practice not supported by AS EAD ingest
<index> | Instead of creating <index>es, add controlaccess terms to components | This allows search and retrieval of all components across the whole corpus rather than in one finding aid. | Better data model for discovery and retrieval
<indexentry> | Do not encode nested indexentries | Import may succeed, but data will be lacking from AS | Practice not supported by AS EAD ingest
<list> | Do not encode nested lists | Import may succeed, but data will be lacking from AS | Practice not supported by AS EAD ingest
<list> | Do not use <defitem> or <list type="deflist"> | Import may succeed, but data will be lacking from AS | Practice not supported by AS EAD ingest
<name> | Avoid <name> | Import may succeed, but data will be lacking from AS; use a more specific <persname>, <corpname>, or <geogname> | Practice not supported by AS EAD ingest
<namegrp> | Do not encode namegrps | Import may succeed, but data will be lacking from AS | Practice not supported by AS EAD ingest
<note> | Do not use <note> anywhere; where legal, use <odd> preferably with <head> | Import may succeed, but <note> will be lacking from AS; <head> in <odd> provides a better label than "generic note" | New limit
<origination> | Do not encode origination as mixed content; all data must be within child elements | Import may succeed, but data will be lacking from AS. Any content not within the child elements <corpname>, <famname>, or <persname> will not go into AS. It will not stop ingest, but data will be lost. Attribute values will also be lost. | Practice not supported by AS EAD ingest; new constraint
<persname> | @role ??? | Disappears on ingest? |
<processinfo> | Finding aid must have a <processinfo> with <head>Aleph ID</head> and content containing the Aleph record number for the collection | Indexing of finding aids in Primo and connecting them with bibliographic records depends on this exact specification being carried out successfully. | New content required
<ref> | <ref>???? | Internal refs lost on ingest |
<table> | Do not use <table> | Longstanding practice to be continued | Existing limit
<unitdate> | Always supply value for @normal attribute in <unitdate> | This had formerly been accomplished through OASIS loader | New attribute data required
<unitdate> | Supply certainty="approximate" value for dates if approximate | | New attribute data required
<unitdate> | Do not use @startYear @endYear | These were Harvard-specific attributes and will get lost in AS ingest | New limit
<unitdate> | Do not nest <unitdate> within <unittitle> | These are un-nested during AS ingest. Starting with nested <unitdate>s in EAD will give archivists an unrealistic idea of what the description will convey when un-nested. | New limit
<unitdate> | Use separate <unitdate> elements for bulk and inclusive dates, and indicate these differences by setting the @type attribute accordingly | AS cannot ingest two dates from one <unitdate> tag. | New limit
<unitdate> | Indicate approximation in <unitdate>s by setting the attribute @certainty="approximate" | "Circa" or "approximate" as part of the date expression is not machine-actionable | New recommendation
<unitdate> | If there are no dates, do not use <unitdate> at all | Older practice often resulted in the following, which cannot be ingested by AS: <unitdate>undated</unitdate>. Consider whether "undated" belongs as part of the title. | New limit
<unitid> | Collection-level <unitid> is required | EAD lacking this data will not load to AS | New content required
<unitid> | Use only one <unitid>. If more than one <unitid> is needed, either place them in separate <c> elements or concatenate all into a single <unitid> | AS will ingest the finding aid, but content will be lost. All but one of the <unitid>s will be lacking. | New limit
<unittitle> | Collection-level <unittitle> is required | EAD lacking this data will not load to AS | New content required
<unittitle> | Use only one <unittitle>. If more than one <unittitle> is needed, either place them in separate <c> elements or concatenate all into a single <unittitle> | AS will ingest the finding aid, but content will be lost. All but one of the <unittitle>s will be lacking. | New limit
<extent> | All <extent> measurement types must come from same list used in AS; if non-canonical measurements are needed, consider a separate <physdesc> | Non-matches will have two results: calculations based on measurements will be inaccurate, and the AS extent drop-down will become cluttered | New limit
<bibliography> | Avoid <bibliography>? | |
<ptrgrp> | avoid <ptrgrp>??? | |

From: archivesspace_users_group-boun...@lyralists.lyrasis.org <archivesspace_users_group-boun...@lyralists.lyrasis.org> On Behalf Of Ryan Flahive
Sent: Tuesday, January 29, 2019 12:17 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] Import Data Errors

Morning Folks,

I’m new to this group as my ArchivesSpace server was just set up a couple of days ago. Due to circumstances too lengthy to describe here, I am manually building this database from EAD files exported from my Archivists’ Toolkit system. My first few imports were successful, but then a few failed due to these errors:

Date: one or more required (or enter a Title)
Title: must not be an empty string (or enter a Date)

These records have titles and dates. Can anyone shed some light on how I resolve this issue? Feel free to email me with suggestions!

Thanks!

Ryan S. Flahive
Archivist
INSTITUTE OF AMERICAN INDIAN ARTS
83 Avan Nu Po Road, Santa Fe, NM 87508
P. 505-424-2392
E.
rflah...@iaia.edu
www.iaia.edu
IAIA's Mission: To empower creativity and leadership in Native arts and cultures through higher education, lifelong learning, and outreach.
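
Several other rows in Kate's list earlier in this thread are mechanical enough to check with a script before attempting an import. The sketch below covers four of them: a component without @level (or with "otherlevel" but no @otherlevel), an <extent> that does not begin with a number, a <unitdate> nested inside a <unittitle>, and a <unitdate> without @normal. The function name, the messages, and the sample are all hypothetical, and "begins with a number" is assumed here to mean that the first non-space character is a digit. ArchivesSpace itself may apply a different rule.

```python
import re
import xml.etree.ElementTree as ET

# Matches <c> and the numbered components <c01> .. <c12>.
COMPONENT = re.compile(r"^c(0[1-9]|1[0-2])?$")


def local(tag):
    """Strip an XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def lint_ead(root):
    """Return (element name, message) pairs for a few of the listed issues."""
    findings = []
    for elem in root.iter():
        name = local(elem.tag)
        if COMPONENT.match(name):
            level = elem.get("level")
            if not level:
                findings.append((name, "missing @level"))
            elif level == "otherlevel" and not elem.get("otherlevel"):
                findings.append((name, "@level is otherlevel but @otherlevel is missing"))
        elif name == "extent":
            text = "".join(elem.itertext()).strip()
            # Assumption: "begins with a number" means the first character is a digit.
            if not re.match(r"\d", text):
                findings.append((name, "does not begin with a number"))
        elif name == "unittitle":
            if any(local(d.tag) == "unitdate" for d in elem.iter()):
                findings.append((name, "contains a nested <unitdate>"))
        elif name == "unitdate":
            if not elem.get("normal"):
                findings.append((name, "missing @normal"))
    return findings


# Hypothetical sample that trips each of the four checks once.
SAMPLE = """<ead><archdesc level="collection"><did>
<unittitle>Papers <unitdate normal="1900/1910">1900-1910</unitdate></unittitle>
<physdesc><extent>about 3 boxes</extent></physdesc></did>
<dsc><c><did><unittitle>Letters</unittitle><unitdate>1901</unitdate></did></c></dsc>
</archdesc></ead>"""

for name, message in lint_ead(ET.fromstring(SAMPLE)):
    print(f"<{name}>: {message}")
```

None of this replaces a real import test. It only narrows down which files are worth fixing first.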