Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
On Mar 6, 2014, at 1:37 PM, Ethan Gruber <ewg4x...@gmail.com> wrote:

>> Let me ask a more direct question. If participating in linked data is a “good thing”, then how do you — anybody here — suggest archivists (or librarians or museum curators) do that starting today? —Eric Morgan
>
> I think that RDFa provides the lowest barrier to entry. Using dcterms for publisher, creator, title, etc. is a good place to start, and if your collection (archival, library, museum) links to terms defined in LOD vocabulary systems (LCSH, Getty, LCNAF, whatever), output these URIs in the HTML interface and tag them in RDFa in such a way that they are semantically meaningful, e.g.:
>
>   <a href="http://vocab.getty.edu/aat/300028569" rel="dcterms:format">manuscripts (document genre)</a>
>
> It would be great if content management systems supported RDFa right out of the box, and perhaps they are all moving in this direction. But you don't need a content management system to do this. If you generate static HTML files for your finding aids from EAD files using XSLT, you can tweak your XSLT output to handle RDFa.

Ethan, thank you. Do other people have any ideas of how libraries, archives, and/or museums can start doing linked data now? And if not, then what do you think needs to happen before additional linked data publication systems can be implemented? — Eric Morgan
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
I’m just curious. To what degree does ArchivesSpace support publishing content as linked data? Transforming EAD (or MARC) into serialized RDF is functional but not ideal for linked data, for many reasons. ArchivesSpace as a content management system may be more feasible. At the very least, something like D2RQ [1] could be put on top of the ArchivesSpace database to expose the underlying content as RDF. What do you think?

[1] D2RQ - http://d2rq.org

— Eric Lease Morgan, University of Notre Dame
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
ArchivesSpace has a REST backend API, and requests yield a response in JSON. As one option, I'd investigate publishing linked data as JSON-LD. Some degree of mapping would be necessary, but I imagine it would be significantly easier to do that than to use something like D2RQ.

Mark

--
Mark A. Matienzo <m...@matienzo.org>
Director of Technology, Digital Public Library of America

On Thu, Mar 6, 2014 at 9:30 AM, Eric Lease Morgan <emor...@nd.edu> wrote:

> I’m just curious. To what degree does ArchivesSpace support publishing content as linked data? Transforming EAD (or MARC) into serialized RDF is functional but not ideal for linked data, for many reasons. ArchivesSpace as a content management system may be more feasible. At the very least, something like D2RQ [1] could be put on top of the ArchivesSpace database to expose the underlying content as RDF. What do you think?
>
> [1] D2RQ - http://d2rq.org
>
> — Eric Lease Morgan, University of Notre Dame
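The JSON-to-JSON-LD mapping Mark describes could be sketched roughly as follows. This is a minimal illustration, not the actual ArchivesSpace API schema: the field names (`uri`, `title`, `publisher`), the base URI, and the dcterms-only context are all assumptions for the sake of the example.

```python
# Sketch: wrap an ArchivesSpace-style JSON record in a JSON-LD context.
# Field names and the base URI below are illustrative assumptions, not
# the real ArchivesSpace API schema.
import json

CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "publisher": "dcterms:publisher",
}

def to_jsonld(record, base_uri="http://example.org/archivesspace"):
    """Convert a plain JSON record into a minimal JSON-LD document."""
    doc = {"@context": CONTEXT, "@id": base_uri + record["uri"]}
    for key in ("title", "publisher"):
        if key in record:
            doc[key] = record[key]
    return doc

record = {"uri": "/repositories/2/resources/1", "title": "Papers of Jane Doe"}
print(json.dumps(to_jsonld(record), indent=2))
```

In practice the records would come from the REST API rather than a literal, and the context would need to cover the full range of fields, but the shape of the work — attach an `@context` and mint `@id`s — stays the same.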
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
On Mar 6, 2014, at 9:47 AM, Mark A. Matienzo <mark.matie...@gmail.com> wrote:

> ArchivesSpace has a REST backend API, and requests yield a response in JSON. As one option, I'd investigate publishing linked data as JSON-LD. Some degree of mapping would be necessary, but I imagine it would be significantly easier to do that than to use something like D2RQ.

If I understand things correctly, using D2RQ to publish database contents as linked data is mostly a systems administration task:

  1. download and install D2RQ
  2. run a D2RQ-specific script to read the (ArchivesSpace) database schema and create a configuration file
  3. run D2RQ with the configuration file
  4. provide access via standard linked data publishing methods
  5. done

If the field names in the initial database are meaningful, and if the database schema is normalized, then D2RQ ought to work pretty well. If many archives use ArchivesSpace, then the field names can become “standard” or at least “best practices”, and the resulting RDF will be well linked.

I have downloaded and run ArchivesSpace on my desktop computer. It imported some of my EAD files pretty well. It created EAC-CPF files from my names. Fun. I didn’t see a way to export things as EAD. The whole interface is beautiful and functional. In my copious spare time I will see about configuring ArchivesSpace to use a MySQL backend (instead of the embedded database), and see about putting D2RQ on top. I think this will be easier than learning a new API and building an entire linked data publishing system. D2RQ may be a viable option, with the understanding that no solution is perfect. — Eric Morgan
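For the curious, the steps above might look something like this using D2RQ's bundled scripts (`generate-mapping` and `d2r-server`). This is a hedged sketch: the database name, credentials, and D2RQ version are placeholders, and real deployments would also need the MySQL JDBC driver on the classpath.

```shell
# Steps 1-2: after downloading D2RQ, generate a mapping file by reading
# the database schema. Database name, user, and password are placeholders.
cd d2rq-0.8.1
./generate-mapping -o archivesspace-mapping.ttl \
    -u aspace -p secret jdbc:mysql://localhost/archivesspace

# Steps 3-4: serve the database as browsable linked data (HTML, RDF,
# and a SPARQL endpoint) on D2RQ's default port, 2020.
./d2r-server archivesspace-mapping.ttl
```

The generated mapping file can then be hand-edited to swap D2RQ's auto-generated terms for community vocabularies, which speaks to the objections raised later in this thread.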
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
The issue I see here is that D2RQ will expose the MySQL database structure as linked data in some sort of indecipherable ontology, and the end result is probably useless. What Mark alludes to is that the developers of ArchivesSpace could write scripts, inherent to the platform, that output linked data conforming to existing or emerging standards. This is much simpler than introducing D2RQ into the application layer, and it allows for greater control of the export models.

As a developer of different, potentially competing, software applications for EAD and EAC-CPF publication: who is to say that ArchivesSpace database field names should be standards or best practices? These are things that should be determined by the archival community, not a software application. CIDOC-CRM is capable of representing the structure and relationships between components of an archival collection. I'm not a huge advocate of the CRM, because I think it has a tendency to be inordinately complex, but *it* is a standard. Therefore, if the archival community decided to adopt the CRM as its RDF data model standard, then ArchivesSpace, ICA-AtoM, EADitor, and other archival management/description systems could adapt to the needs of the community and offer content in these models.

Ethan

On Thu, Mar 6, 2014 at 10:41 AM, Eric Lease Morgan <emor...@nd.edu> wrote:

> On Mar 6, 2014, at 9:47 AM, Mark A. Matienzo <mark.matie...@gmail.com> wrote:
>
>> ArchivesSpace has a REST backend API, and requests yield a response in JSON. As one option, I'd investigate publishing linked data as JSON-LD. Some degree of mapping would be necessary, but I imagine it would be significantly easier to do that than to use something like D2RQ.
>
> If I understand things correctly, using D2RQ to publish database contents as linked data is mostly a systems administration task:
>
>   1. download and install D2RQ
>   2. run a D2RQ-specific script to read the (ArchivesSpace) database schema and create a configuration file
>   3. run D2RQ with the configuration file
>   4. provide access via standard linked data publishing methods
>   5. done
>
> If the field names in the initial database are meaningful, and if the database schema is normalized, then D2RQ ought to work pretty well. If many archives use ArchivesSpace, then the field names can become “standard” or at least “best practices”, and the resulting RDF will be well linked.
>
> I have downloaded and run ArchivesSpace on my desktop computer. It imported some of my EAD files pretty well. It created EAC-CPF files from my names. Fun. I didn’t see a way to export things as EAD. The whole interface is beautiful and functional. In my copious spare time I will see about configuring ArchivesSpace to use a MySQL backend (instead of the embedded database), and see about putting D2RQ on top. I think this will be easier than learning a new API and building an entire linked data publishing system. D2RQ may be a viable option, with the understanding that no solution is perfect. — Eric Morgan
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
Eric,

You probably want to do the 1.0.7 full install, which does use a MySQL database. Sounds like you've installed just the demo version.

Kari Smith

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@listserv.nd.edu] On Behalf Of Eric Lease Morgan
Sent: Thursday, March 06, 2014 10:42 AM
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]

On Mar 6, 2014, at 9:47 AM, Mark A. Matienzo <mark.matie...@gmail.com> wrote:

> ArchivesSpace has a REST backend API, and requests yield a response in JSON. As one option, I'd investigate publishing linked data as JSON-LD. Some degree of mapping would be necessary, but I imagine it would be significantly easier to do that than to use something like D2RQ.

If I understand things correctly, using D2RQ to publish database contents as linked data is mostly a systems administration task:

  1. download and install D2RQ
  2. run a D2RQ-specific script to read the (ArchivesSpace) database schema and create a configuration file
  3. run D2RQ with the configuration file
  4. provide access via standard linked data publishing methods
  5. done

If the field names in the initial database are meaningful, and if the database schema is normalized, then D2RQ ought to work pretty well. If many archives use ArchivesSpace, then the field names can become “standard” or at least “best practices”, and the resulting RDF will be well linked.

I have downloaded and run ArchivesSpace on my desktop computer. It imported some of my EAD files pretty well. It created EAC-CPF files from my names. Fun. I didn't see a way to export things as EAD. The whole interface is beautiful and functional. In my copious spare time I will see about configuring ArchivesSpace to use a MySQL backend (instead of the embedded database), and see about putting D2RQ on top. I think this will be easier than learning a new API and building an entire linked data publishing system. D2RQ may be a viable option, with the understanding that no solution is perfect.

- Eric Morgan
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
On Thu, Mar 6, 2014 at 11:05 AM, Ethan Gruber <ewg4x...@gmail.com> wrote:

> The issue I see here is that D2RQ will expose the MySQL database structure as linked data in some sort of indecipherable ontology, and the end result is probably useless. What Mark alludes to is that the developers of ArchivesSpace could write scripts, inherent to the platform, that output linked data conforming to existing or emerging standards. This is much simpler than introducing D2RQ into the application layer, and it allows for greater control of the export models.
>
> As a developer of different, potentially competing, software applications for EAD and EAC-CPF publication: who is to say that ArchivesSpace database field names should be standards or best practices? These are things that should be determined by the archival community, not a software application. CIDOC-CRM is capable of representing the structure and relationships between components of an archival collection. I'm not a huge advocate of the CRM, because I think it has a tendency to be inordinately complex, but *it* is a standard. Therefore, if the archival community decided to adopt the CRM as its RDF data model standard, then ArchivesSpace, ICA-AtoM, EADitor, and other archival management/description systems could adapt to the needs of the community and offer content in these models.

For the sake of consumers of this data who might not be deeply acquainted with archives standards, but who are interested in building a high-level aggregation of various sets of available resources (like, say, search engines), it would also be nice to see an attempt at a schema.org representation, too. Perhaps as RDFa or microdata in the regular web UI.

Dan, aka one-trick pony
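A schema.org representation of a finding aid could be as small as the sketch below, which builds a JSON-LD description suitable for embedding in the HTML UI. The type choice (`CreativeWork`), property set, and all values are illustrative assumptions, not a recommendation from schema.org or ArchivesSpace.

```python
# Sketch: a minimal schema.org JSON-LD description of a finding aid,
# suitable for embedding in an HTML page. All values are invented.
import json

def schema_org_jsonld(title, creator, url):
    """Build a minimal schema.org CreativeWork description."""
    return {
        "@context": "http://schema.org",
        "@type": "CreativeWork",
        "name": title,
        "creator": {"@type": "Person", "name": creator},
        "url": url,
    }

doc = schema_org_jsonld("Papers of Jane Doe", "Jane Doe",
                        "http://example.org/findingaids/doe")
print('<script type="application/ld+json">%s</script>' % json.dumps(doc))
```

The same triples could equally be expressed as RDFa or microdata attributes on the existing markup, per Dan's suggestion; JSON-LD in a `<script>` block is simply the least invasive to an existing template.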
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
On Thu, Mar 6, 2014 at 11:05 AM, Ethan Gruber <ewg4x...@gmail.com> wrote:

> What Mark alludes to is that the developers of ArchivesSpace could write scripts, inherent to the platform, that output linked data conforming to existing or emerging standards. This is much simpler than introducing D2RQ into the application layer, and it allows for greater control of the export models.
>
> As a developer of different, potentially competing, software applications for EAD and EAC-CPF publication: who is to say that ArchivesSpace database field names should be standards or best practices? These are things that should be determined by the archival community, not a software application.

Exactly. I'm also just saying that D2RQ in this case is a bad idea. ArchivesSpace uses an ORM layer, and as such even the database interaction is conveniently abstracted away. ArchivesSpace has an API; leverage the API, not the datastore. Going straight at the datastore in this case is, frankly, a bad idea.

Mark
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
Let me ask a more direct question. If participating in linked data is a “good thing”, then how do you — anybody here — suggest archivists (or librarians or museum curators) do that starting today? —Eric Morgan
Re: [CODE4LIB] ArchivesSpace v1.0.7 Released [linked data]
I think that RDFa provides the lowest barrier to entry. Using dcterms for publisher, creator, title, etc. is a good place to start, and if your collection (archival, library, museum) links to terms defined in LOD vocabulary systems (LCSH, Getty, LCNAF, whatever), output these URIs in the HTML interface and tag them in RDFa in such a way that they are semantically meaningful, e.g.:

  <a href="http://vocab.getty.edu/aat/300028569" rel="dcterms:format">manuscripts (document genre)</a>

It would be great if content management systems supported RDFa right out of the box, and perhaps they are all moving in this direction. But you don't need a content management system to do this. If you generate static HTML files for your finding aids from EAD files using XSLT, you can tweak your XSLT output to handle RDFa.

Ethan

On Thu, Mar 6, 2014 at 12:56 PM, Eric Lease Morgan <emor...@nd.edu> wrote:

> Let me ask a more direct question. If participating in linked data is a “good thing”, then how do you — anybody here — suggest archivists (or librarians or museum curators) do that starting today? —Eric Morgan
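The RDFa tagging Ethan describes can be mechanized with a tiny helper along these lines. The function name and structure are hypothetical, not from any real library; in practice this logic would live in an XSLT template or a CMS theme rather than a standalone script, and the page would also need the `dcterms` prefix declared (e.g. via a `prefix` attribute on a parent element).

```python
# Sketch: emit an RDFa-tagged anchor for a controlled-vocabulary term,
# producing markup like the example above. Helper name is illustrative.
from html import escape

def rdfa_anchor(uri, rdfa_property, label):
    """Return an <a> element whose rel attribute carries an RDFa property."""
    return '<a href="%s" rel="%s">%s</a>' % (
        escape(uri, quote=True),       # term URI in a LOD vocabulary
        escape(rdfa_property, quote=True),  # e.g. dcterms:format
        escape(label),                 # human-readable display text
    )

print(rdfa_anchor("http://vocab.getty.edu/aat/300028569",
                  "dcterms:format", "manuscripts (document genre)"))
```

Pointed at whichever vocabulary URIs your descriptions already cite (LCSH, Getty AAT, LCNAF), a helper like this turns ordinary hyperlinks into machine-readable statements with essentially no change to the page's appearance.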