Re: [CODE4LIB] linked data recipe
+1 for schema.org as one of the first steps. COinS are another useful, simple markup if the data is already there. I'm looking forward to the book.

Sincerely,
David Bigwood
Lunar and Planetary Institute

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coyle
Sent: Tuesday, November 19, 2013 10:10 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] linked data recipe

Eric, if you want to leap into the linked data world in the fastest, easiest way possible, then I suggest looking at microdata markup, e.g. schema.org. [1] Schema.org does not require you to transform your data at all: it only requires markup of your online displays. This makes sense because as long as your data is in local databases, it's not visible to the linked data universe anyway, so why not take the easy way out and just add linked data to your public online displays? This doesn't require a transformation of your entire record (some of which may not be suitable as linked data in any case), only of those "things" that are likely to link usefully, which generally means "things for which you have an identifier." And you make no changes to your database, only to your display.

OCLC is already producing this markup in WorldCat records [2] -- not perfectly, of course, with lots of warts, but it is a first step. However, it is a first step that makes more sense to me than *transforming* or *cross-walking* current metadata. It will also, I believe, help us understand which bits of our current metadata will make the transition to linked data, and which bits should remain as accessible documents that users can reach through linked data.

kc

[1] http://schema.org, and look at the work going on to add bibliographic properties at http://www.w3.org/community/schemabibex/wiki/Main_Page
[2] look at the "linked data" section of any WorldCat page for a single item, such as http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725&referer=brief_results
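The kind of display markup Karen describes can be sketched in a few lines. The snippet below wraps a title and author in schema.org Book microdata; the record fields, HTML shape, and helper name are my own illustrative assumptions, not OCLC's actual WorldCat markup.

```python
# A minimal sketch of schema.org microdata added to an existing display.
# The record fields and HTML structure are illustrative assumptions.
from html import escape

def book_microdata(title: str, author: str) -> str:
    """Wrap a display's title and author in schema.org Book markup."""
    return (
        '<div itemscope itemtype="http://schema.org/Book">\n'
        f'  <h1 itemprop="name">{escape(title)}</h1>\n'
        f'  <span itemprop="author">{escape(author)}</span>\n'
        '</div>'
    )

print(book_microdata("The Adventures of Tom Sawyer", "Mark Twain"))
```

The underlying database record is untouched; only the HTML template that renders it gains the `itemscope`/`itemprop` attributes.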
Re: [CODE4LIB] linked data recipe
Ethan, it looks to me like it depends on who you are and who your target is. In the schema.org clan there is still a majority using microdata, but my impression is that these are the online sales sites whose primary interest is SEO. RDFa Lite is moving up generally [0], yet I haven't seen a clear statement that the search engines consider it equal to microdata (even though the two are very close). Perhaps they do?

Recently it was announced that JSON-LD is now an "official" schema.org markup. The advantage of JSON-LD is that it separates the display from the markup, so there is less of a formatting issue. However, it also opens it all up to scamming -- well, to easier scamming than with the other two formats.

Meanwhile, as more and more folks discover schema.org, there is more and more demand for additions to what was originally an extremely simple set of properties. Some predict that it will crumble under its own disorderliness, a metadata Tower of Babel.

Regardless of that, I still think that the web is the place for linked data, even though there are quite a few enterprise implementations of linked data that do not present a public face. I'd prefer to have some idea of what we want to link to, why, and how it will help users. There are some examples, like FAO's Open Agris [1], but I'd like to see more. (And I'm not sure what LIBRIS [2] is doing with their catalog, which is reported to be a triple store.)

kc

[0] http://webdatacommons.org/
[1] http://agris.fao.org/openagris/
[2] http://libris.kb.se/?language=en

On 11/19/13 8:28 AM, Ethan Gruber wrote:

> Hasn't the pendulum swung back toward RDFa Lite
> (http://www.w3.org/TR/rdfa-lite/) recently? They are fairly equivalent,
> but I'm not sure about all the politics involved.
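Karen's point that JSON-LD separates the display from the markup can be sketched with the standard library alone. The bibliographic values below are illustrative assumptions; a real page would embed the result in a `<script type="application/ld+json">` element alongside the ordinary HTML.

```python
# A minimal sketch of a schema.org JSON-LD block for a catalog page.
# The record values are illustrative assumptions.
import json

record = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "The Adventures of Tom Sawyer",
    "author": {"@type": "Person", "name": "Mark Twain"},
}

# Unlike microdata or RDFa, this block is emitted separately from the
# human-readable markup, so display templates need no per-field changes.
island = json.dumps(record, indent=2)
print(island)
```

That separation is also why Karen notes it is easier to scam: nothing forces the JSON-LD assertions to match what the page actually shows.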
Re: [CODE4LIB] linked data recipe
I think this is a nice list, Eric. I particularly like the iterative approach. I'm not a huge fan of #6, and #7 seems like it might be challenging from a data synchronization perspective. But it's still a nice list.

While I think it's right that you don't want to let the perfect (a complete and perfect domain model) be the enemy of the good (iterative data publishing on the Web), it definitely helps if a Linked Data project has an idea of what types of resources it is putting on the Web, and how they potentially fit in with other stuff that's already there. I honestly think the hardest thing is to establish *why* you want to publish data on the Web: who is it for, how will they use it, etc. If the honest answer is simply "it is the right thing to do", "we want to get a grant", or "we want to build the semweb", that's fine, but it's not ideal. Ideally there's an actual use case where exposing structured data on the Web yields potential benefits that can be realized with Linked Data.

//Ed

On Nov 19, 2013, at 11:55 AM, Eric Lease Morgan wrote:

> I don't advocate this as the fastest, easiest way possible because it
> forces RDF "aggregators" to parse HTML, and thus passes a level of
> complexity down the processing chain. Expose RDF as RDF, not embedded
> in another format.
Re: [CODE4LIB] linked data recipe
On Nov 19, 2013, at 11:09 AM, Karen Coyle wrote:

> Eric, if you want to leap into the linked data world in the fastest,
> easiest way possible, then I suggest looking at microdata markup, e.g.
> schema.org. [1] ...
>
> [1] http://schema.org

I don't advocate this as the fastest, easiest way possible, because it forces RDF "aggregators" to parse HTML and thus passes a level of complexity down the processing chain. Expose RDF as RDF, not embedded in another format. I do advocate the inclusion of schema.org markup, RDFa, etc. in HTML, but as a level of refinement. --Eric Morgan
Re: [CODE4LIB] linked data recipe
On Nov 19, 2013, at 9:54 AM, Aaron Rubinstein wrote:

> I think you've hit the nail on the head here, Karen. I would just add,
> or maybe reassure, that this does not necessarily require rethinking
> your existing metadata but how to translate that existing metadata into
> a linked data environment. Though this might seem like a pain, in many
> cases it will actually inspire you to go back and improve/increase the
> value of that existing metadata...

There are tools allowing people to translate existing metadata into a linked data environment, and for right now I advocate that they are good enough. I will provide simplistic examples.

For people who maintain MARC records:

  1. convert the MARC records to MARCXML with the MARCXML Toolkit [1]
  2. convert the MARCXML to RDF/XML in the manner of BIBFRAME's transformation service [2]
  3. save the resulting RDF/XML on a Web server
  4. convert the MARC (or MARCXML) into (valid) HTML
  5. save the resulting HTML on a Web server
  6. for extra credit, implement a content negotiation service for the HTML and RDF/XML
  7. for extra extra credit, implement a SPARQL endpoint for your RDF

If one does Steps #1 through #5, then they are doing linked data and participating in the Semantic Web. That is the goal.

For people who maintain EAD files:

  1. transform the EAD files into RDF/XML with a stylesheet created by the Archives Hub [3]
  2. save the resulting RDF/XML on a Web server
  3. transform the EAD into HTML, using your favorite EAD-to-HTML stylesheet [4]
  4. save the resulting HTML on a Web server
  5. for extra credit, implement a content negotiation service for the HTML and RDF/XML
  6. for extra extra credit, implement a SPARQL endpoint for your RDF

If one does Steps #1 through #4 of this example, then they are doing linked data and participating in the Semantic Web. That is the goal.

In both examples the end result will be a valid linked data implementation. Not complete. Not necessarily as thorough as desired. Not necessarily as accurate as desired. But valid. Such a process will not expose false or incorrect data/information, but rather data/information that is intended to be maintained, improved, and updated on a continual basis.

Finally, I want to highlight a distinction between well-formed, valid, and accurate information -- linked data. I will use XML as an example. XML can be "well-formed", meaning it is syntactically correct: specific characters are represented by entities, elements are correctly opened and closed, the whole structure has a single root, etc. The next level up is "valid". Valid XML is XML that conforms to a DTD or schema; it is semantically correct. It means that required elements exist and are presented in a particular order, that specific attributes used in elements are denoted, and, in the case of schemas, that values in elements and attributes take on particular shapes beyond simple character data. Finally, XML can be "accurate" (my term), meaning the assertions in the XML are true. For example, there is nothing stopping me from putting the title of a work in an author element. How is the computer expected to know the difference? It can't. Alternatively, the title could be presented as "Thee Adventrs Av Tom Sawher", when the more accurate title may be "The Adventures of Tom Sawyer". Well-formedness and validity are the domain of computers. Accuracy is the domain of humans.

In the world of linked data, you are not participating if your published data is not "well-formed". (Go back to start.) You are participating if it is "valid". But you are doing really well if the data is "accurate". Let's not make this more difficult than it really is.
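The well-formed/accurate distinction can be demonstrated with the standard library: a parser happily accepts any syntactically correct document, regardless of whether its assertions make sense. The sample records below are my own illustrative assumptions.

```python
# Well-formedness is a purely syntactic check; the parser cannot tell
# whether a title has been stuffed into an author element.
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the string parses as XML at all."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Syntactically fine, even though the "author" value is really a title.
inaccurate = "<record><author>The Adventures of Tom Sawyer</author></record>"
# An unclosed element: not even well-formed.
broken = "<record><author>Mark Twain</record>"

print(is_well_formed(inaccurate))  # True
print(is_well_formed(broken))      # False
```

Catching the first kind of error is the computer's job; catching the second requires a DTD or schema plus, ultimately, a human.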
[1] MARCXML Toolkit - linked at http://www.loc.gov/standards/marcxml/
[2] BIBFRAME's transformation service - http://bibframe.org/tools/transform/start
[3] Archives Hub stylesheet - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl
[4] EAD to HTML - for example, http://www.catholicresearch.net/data/ead/ead2html.xsl

-- Eric Morgan
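The "extra credit" content negotiation step in the recipes above can be sketched as a small Accept-header chooser. The file names and the crude q-value handling are simplified assumptions, not a full HTTP implementation; a production setup would more likely use the web server's own negotiation machinery.

```python
# A minimal sketch of content negotiation between HTML and RDF/XML.
# File names are illustrative; q-values and wildcards are handled
# only crudely (parameters are stripped, not weighed).

REPRESENTATIONS = {
    "application/rdf+xml": "record.rdf",
    "text/html": "record.html",
}

def negotiate(accept_header: str) -> str:
    """Return the file to serve for a given Accept header."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()   # drop any q-value
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
    return REPRESENTATIONS["text/html"]           # default representation

print(negotiate("application/rdf+xml"))  # record.rdf
print(negotiate("text/html,application/xhtml+xml;q=0.9"))  # record.html
```

With this in place, one URI can serve the HTML to people and the RDF/XML to aggregators.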
Re: [CODE4LIB] linked data recipe
Hasn't the pendulum swung back toward RDFa Lite (http://www.w3.org/TR/rdfa-lite/) recently? They are fairly equivalent, but I'm not sure about all the politics involved.

On Tue, Nov 19, 2013 at 11:09 AM, Karen Coyle wrote:

> Eric, if you want to leap into the linked data world in the fastest,
> easiest way possible, then I suggest looking at microdata markup, e.g.
> schema.org. [1]
Re: [CODE4LIB] linked data recipe
yo, i get it

On Tue, Nov 19, 2013 at 10:54 AM, Ross Singer wrote:

> I don't know what your definition of "serialization" is, but I don't know
> of any where "data model" and "formatted output of a data model" are
> synonymous.
>
> RDF is a data model *not* a serialization.
Re: [CODE4LIB] linked data recipe
I don't know what your definition of "serialization" is, but I don't know of any where "data model" and "formatted output of a data model" are synonymous.

RDF is a data model, *not* a serialization.

-Ross.

On Tue, Nov 19, 2013 at 10:45 AM, Ethan Gruber wrote:

> I see that serialization has a different definition in computer science than I thought it did.
>
> On Tue, Nov 19, 2013 at 10:36 AM, Ross Singer wrote:
>
>> That's still not a "serialization". It's just a similar data model. Pretty huge difference.
>>
>> -Ross.
>>
>> On Tue, Nov 19, 2013 at 10:31 AM, Ethan Gruber wrote:
>>
>>> I'm not sure that I agree that RDF is not a serialization. It really depends on the context of the system and the intended use of the linked data. For example, TEI is designed with a specific purpose which cannot be replicated in RDF (at least, not very easily at all), but deriving RDF from highly-linked TEI to put into an endpoint can open doors to queries which are otherwise impossible to make on the data. This certainly requires some rethinking of the way texts interact. But perhaps it may be best to say that RDF *can* (but not necessarily) be a derivation, rather than a serialization, of some larger, more complex canonical data model.
>>>
>>> Ethan
>>>
>>> On Tue, Nov 19, 2013 at 9:54 AM, Aaron Rubinstein <arubi...@library.umass.edu> wrote:
>>>
>>>> I think you've hit the nail on the head here, Karen. I would just add, or maybe reassure, that this does not necessarily require rethinking your existing metadata but how to translate that existing metadata into a linked data environment. Though this might seem like a pain, in many cases it will actually inspire you to go back and improve/increase the value of that existing metadata.
>>>>
>>>> This is definitely looking awesome, Eric!
>>>>
>>>> Aaron
>>>>
>>>> On Nov 19, 2013, at 9:41 AM, Karen Coyle wrote:
>>>>
>>>>> Eric, I think this skips a step - which is the design step in which you create a domain model that uses linked data as its basis. RDF is not a serialization; it actually may require you to re-think the basic structure of your metadata. The reason for that is that it provides capabilities that record-based data models do not. Rather than starting with current metadata, you need to take a step back and ask: what does my information world look like as linked data?
>>>>>
>>>>> I repeat: RDF is NOT A SERIALIZATION.
>>>>>
>>>>> kc
>>>>>
>>>>> On 11/19/13 5:04 AM, Eric Lease Morgan wrote:
>>>>>
>>>>>> I believe participating in the Semantic Web and providing content via the principles of linked data is not "rocket surgery", especially for cultural heritage institutions -- libraries, archives, and museums. Here is a simple recipe for their participation:
>>>>>>
>>>>>>   1. use existing metadata standards (MARC, EAD, etc.) to describe collections
>>>>>>   2. use any number of existing tools to convert the metadata to HTML, and save the HTML on a Web server
>>>>>>   3. use any number of existing tools to convert the metadata to RDF/XML (or some other "serialization" of RDF), and save the RDF/XML on a Web server
>>>>>>   4. rest, congratulate yourself, and share your experience with others in your domain
>>>>>>   5. after the first time through, go back to Step #1, but this time work with other people inside your domain, making sure you use as many of the same URIs as possible
>>>>>>   6. after the second time through, go back to Step #1, but this time supplement access to your linked data with a triple store, thus supporting search
>>>>>>   7. after the third time through, go back to Step #1, but this time use any number of existing tools to expose the content in your other information systems (relational databases, OAI-PMH data repositories, etc.)
>>>>>>   8. for dessert, cogitate ways to exploit the linked data in your domain to discover new and additional relationships between URIs, and thus make the Semantic Web more of a reality
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> I am in the process of writing a guidebook on the topic of linked data and archives. In the guidebook I will elaborate on this recipe and provide instructions for its implementation. [1]
>>>>>>
>>>>>> [1] guidebook - http://sites.tufts.edu/liam/
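Step #3 of Eric's recipe ("convert the metadata to RDF/XML") can be sketched with only the standard library. The flat record, the Dublin Core property choices, and the example subject URI below are illustrative assumptions, not the output of any actual MARC or EAD tool.

```python
# A toy sketch of "convert the metadata to RDF/XML": one flat record
# becomes one rdf:Description. Property names and the subject URI are
# illustrative assumptions.
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)

def record_to_rdfxml(subject_uri: str, record: dict) -> str:
    """Serialize a flat field/value record as minimal RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": subject_uri})
    for prop, value in record.items():
        ET.SubElement(desc, f"{{{DC}}}{prop}").text = value
    return ET.tostring(root, encoding="unicode")

rdfxml = record_to_rdfxml(
    "http://example.org/item/527725",   # hypothetical subject URI
    {"title": "The Adventures of Tom Sawyer", "creator": "Mark Twain"},
)
print(rdfxml)
```

Real converters such as the BIBFRAME transformation service or the Archives Hub stylesheet do far more modeling work; the point here is only that the serialization step itself is mechanical once the URIs are chosen.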
Re: [CODE4LIB] linked data recipe
On Nov 19, 2013, at 9:41 AM, Karen Coyle wrote:

> Eric, I think this skips a step - which is the design step in which you
> create a domain model that uses linked data as its basis. RDF is not a
> serialization; it actually may require you to re-think the basic
> structure of your metadata. The reason for that is that it provides
> capabilities that record-based data models do not. Rather than starting
> with current metadata, you need to take a step back and ask: what does
> my information world look like as linked data?

I respectfully disagree. I do not think it is necessary to create a domain model ahead of time, nor do I think it is necessary for us to re-think our metadata structures. There already exist tools enabling us -- cultural heritage institutions -- to manifest our metadata as RDF. The manifestations may not be perfect, but "we need to learn to walk before we run", and the metadata structures we have right now will work for right now. As we mature we can refine our processes. I do not advocate "stepping back and asking". I advocate looking forward and doing. --Eric Morgan
Re: [CODE4LIB] linked data recipe
On Nov 19, 2013, at 8:48 AM, Robert Forkel wrote:

> while I also think this is not rocket surgery, I'd like to point out that trial (and potentially error) as suggested by your "go back to step #1" instructions is not a good solution to coming up with URIs. I think once published - i.e. put on a webserver - you should be able to keep the URIs in your RDF persistent. Otherwise you are polluting the Semantic Web with dead links and make it hard for aggregators to find out whether the data they harvested is still valid.
>
> So while iterative approaches are pragmatic and often work out well, for the particular issue of coming up with URIs I'd recommend spending as much thought before publishing as you can spend.

Intellectually, I completely understand. Practically, I still advocate publishing the linked data as soon as possible. Knowledge is refined over time. The data being published is neither incorrect nor invalid, just not as good as it could be. Data aggregators will refresh their stores, and old information will go to "Big Byte Heaven”.

It is just like a library collection. The “best” books are collected. The good ones get used. The old ones get weeded or relegated to off-site storage. What remains is a current perception of truth. Building library collections is a process that is never done and never perfect. Linked data is a literal reflection of library collections, therefore linked data is never done and never perfect either. Books will be removed from the collection; URIs will break and go stale.

The process of providing linked data is a lot like painting a painting. The painting is painted as a whole, from start to finish. One does not get one corner of the canvas perfect and move on from there. An idea is articulated. An outline is drawn. The outline is refined, and the painting gradually comes to life. Many times paintings are never finished but worked, reworked, and worked some more.

If the profession waits to make its list of URIs perfect, then it will never leave the starting gate. I know that is not what is being advocated, but since one cannot measure the timeless validity of a URI, I advocate that the current URIs are good enough, with an understood commitment to updating and refining them in the future.

—Eric Morgan
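Robert's concern about URI persistence and Eric's "good enough for now" rejoinder are easier to reconcile when URIs are minted from identifiers that already persist in the source system (accession numbers, OCLC numbers, EAD unit IDs) rather than from mutable strings such as titles. The sketch below is a hypothetical illustration only; the base URI, path scheme, and function name are assumptions, not anything proposed in this thread.

```python
# Hypothetical sketch: mint stable, opaque URIs from identifiers that
# already persist in the source system. Because the URI depends only on
# the persistent local identifier, re-running the conversion pipeline
# (Eric's "go back to step #1") reproduces the same URIs each time.

BASE = "http://example.org/id/"   # assumed base URI for illustration

def mint_uri(record_type, local_id):
    """Build a URI from a persistent local identifier, never from
    mutable descriptive data such as a title."""
    local_id = str(local_id).strip()
    if not local_id:
        raise ValueError("a persistent local identifier is required")
    return "%s%s/%s" % (BASE, record_type, local_id)

print(mint_uri("work", "527725"))  # → http://example.org/id/work/527725
```

The design choice is the point: opaque, identifier-based URIs can survive later refinements to the descriptive metadata, so iterating on the data does not have to mean breaking the links.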
Re: [CODE4LIB] linked data recipe
I see that serialization has a different definition in computer science than I thought it did.

On Tue, Nov 19, 2013 at 10:36 AM, Ross Singer wrote:

> That's still not a "serialization". It's just a similar data model. Pretty huge difference.
>
> -Ross.
Re: [CODE4LIB] linked data recipe
That's still not a "serialization". It's just a similar data model. Pretty huge difference.

-Ross.

On Tue, Nov 19, 2013 at 10:31 AM, Ethan Gruber wrote:

> I'm not sure that I agree that RDF is not a serialization. It really depends on the context of the system and intended use of the linked data. For example, TEI is designed with a specific purpose which cannot be replicated in RDF (at least, not very easily at all), but deriving RDF from highly-linked TEI to put into an endpoint can open doors to queries which are otherwise impossible to make on the data. This certainly requires some rethinking of the way texts interact. But perhaps it may be best to say that RDF *can* (but not necessarily) be a derivation, rather than serialization, of some larger, more complex canonical data model.
>
> Ethan
Re: [CODE4LIB] linked data recipe
I'm not sure that I agree that RDF is not a serialization. It really depends on the context of the system and intended use of the linked data. For example, TEI is designed with a specific purpose which cannot be replicated in RDF (at least, not very easily at all), but deriving RDF from highly-linked TEI to put into an endpoint can open doors to queries which are otherwise impossible to make on the data. This certainly requires some rethinking of the way texts interact. But perhaps it may be best to say that RDF *can* (but not necessarily) be a derivation, rather than serialization, of some larger, more complex canonical data model.

Ethan

On Tue, Nov 19, 2013 at 9:54 AM, Aaron Rubinstein <arubi...@library.umass.edu> wrote:

> I think you’ve hit the nail on the head here, Karen. I would just add, or maybe reassure, that this does not necessarily require rethinking your existing metadata but how to translate that existing metadata into a linked data environment. Though this might seem like a pain, in many cases it will actually inspire you to go back and improve/increase the value of that existing metadata.
>
> This is definitely looking awesome, Eric!
>
> Aaron
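Ethan's distinction — deriving RDF from highly-linked TEI rather than treating RDF as a serialization of it — can be shown with a toy example. The sketch below lifts only the queryable relationships (which text mentions which person) out of an invented, TEI-flavored fragment, discarding the document structure that RDF cannot usefully carry. The fragment, element names, and all URIs are assumptions made for illustration.

```python
# Toy sketch of deriving triples from TEI-like XML: extract only the
# relationships worth querying, not the whole document structure.
# The fragment and every URI below are invented for illustration.
import xml.etree.ElementTree as ET

tei = """<text xml:id="letter1">
  <body>
    <p>Dear <persName ref="#neyman">Professor Neyman</persName>, ...</p>
    <p>Regards to <persName ref="#pearson">Pearson</persName>.</p>
  </body>
</text>"""

def derive_triples(xml_string, base="http://example.org/"):
    """Derive (subject, predicate, object) triples from a TEI-ish text."""
    root = ET.fromstring(xml_string)
    # The xml: prefix is predeclared in XML; ElementTree exposes it so:
    text_id = root.get("{http://www.w3.org/XML/1998/namespace}id")
    subject = base + "text/" + text_id
    triples = []
    for pers in root.iter("persName"):
        ref = pers.get("ref", "").lstrip("#")
        triples.append((subject, base + "mentions", base + "person/" + ref))
    return triples

for s, p, o in derive_triples(tei):
    print(s, p, o)
```

Note what is lost on purpose: paragraph order, the running text, the markup itself. That is the sense in which the RDF is a derivation of the TEI, not a serialization of it.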
Re: [CODE4LIB] linked data recipe
I think you’ve hit the nail on the head here, Karen. I would just add, or maybe reassure, that this does not necessarily require rethinking your existing metadata but how to translate that existing metadata into a linked data environment. Though this might seem like a pain, in many cases it will actually inspire you to go back and improve/increase the value of that existing metadata.

This is definitely looking awesome, Eric!

Aaron

On Nov 19, 2013, at 9:41 AM, Karen Coyle wrote:

> Eric, I think this skips a step - which is the design step in which you create a domain model that uses linked data as its basis. RDF is not a serialization; it actually may require you to re-think the basic structure of your metadata. The reason for that is that it provides capabilities that record-based data models do not. Rather than starting with current metadata, you need to take a step back and ask: what does my information world look like as linked data?
>
> I repeat: RDF is NOT A SERIALIZATION.
>
> kc
Re: [CODE4LIB] linked data recipe
Eric, I think this skips a step - which is the design step in which you create a domain model that uses linked data as its basis. RDF is not a serialization; it actually may require you to re-think the basic structure of your metadata. The reason for that is that it provides capabilities that record-based data models do not. Rather than starting with current metadata, you need to take a step back and ask: what does my information world look like as linked data?

I repeat: RDF is NOT A SERIALIZATION.

kc

On 11/19/13 5:04 AM, Eric Lease Morgan wrote:

> I believe participating in the Semantic Web and providing content via the principles of linked data is not "rocket surgery", especially for cultural heritage institutions -- libraries, archives, and museums. Here is a simple recipe for their participation: ...

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
Re: [CODE4LIB] linked data recipe
Hi Eric,

while I also think this is not rocket surgery, I'd like to point out that trial (and potentially error) as suggested by your "go back to step #1" instructions is not a good solution to coming up with URIs. I think once published - i.e. put on a webserver - you should be able to keep the URIs in your RDF persistent. Otherwise you are polluting the Semantic Web with dead links and making it hard for aggregators to find out whether the data they harvested is still valid.

So while iterative approaches are pragmatic and often work out well, for the particular issue of coming up with URIs I'd recommend spending as much thought before publishing as you can spend.

best
robert

On Tue, Nov 19, 2013 at 2:04 PM, Eric Lease Morgan wrote:

> I believe participating in the Semantic Web and providing content via the principles of linked data is not "rocket surgery", especially for cultural heritage institutions -- libraries, archives, and museums. Here is a simple recipe for their participation: ...
Re: [CODE4LIB] linked data recipe
It's a great start, Eric. It helps me think that I can do it. Looking forward to more.

Brian Zelip
UIUC

On Tue, Nov 19, 2013 at 7:04 AM, Eric Lease Morgan wrote:

> I believe participating in the Semantic Web and providing content via the principles of linked data is not "rocket surgery", especially for cultural heritage institutions -- libraries, archives, and museums. Here is a simple recipe for their participation: ...
[CODE4LIB] linked data recipe
I believe participating in the Semantic Web and providing content via the principles of linked data is not "rocket surgery", especially for cultural heritage institutions -- libraries, archives, and museums. Here is a simple recipe for their participation:

1. use existing metadata standards (MARC, EAD, etc.) to describe collections

2. use any number of existing tools to convert the metadata to HTML, and save the HTML on a Web server

3. use any number of existing tools to convert the metadata to RDF/XML (or some other "serialization" of RDF), and save the RDF/XML on a Web server

4. rest, congratulate yourself, and share your experience with others in your domain

5. after the first time through, go back to Step #1, but this time work with other people inside your domain, making sure you use as many of the same URIs as possible

6. after the second time through, go back to Step #1, but this time supplement access to your linked data with a triple store, thus supporting search

7. after the third time through, go back to Step #1, but this time use any number of existing tools to expose the content in your other information systems (relational databases, OAI-PMH data repositories, etc.)

8. for dessert, cogitate ways to exploit the linked data in your domain to discover new and additional relationships between URIs, and thus make the Semantic Web more of a reality

What do you think?

I am in the process of writing a guidebook on the topic of linked data and archives. In the guidebook I will elaborate on this recipe and provide instructions for its implementation. [1]

[1] guidebook - http://sites.tufts.edu/liam/

--
Eric Lease Morgan
University of Notre Dame
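The conversion in step 3 of the recipe can be sketched in a few lines. The example below is a minimal, hypothetical illustration using only the Python standard library: it takes a flat, Dublin Core-flavored record (as might be crosswalked from MARC or EAD) and serializes it as N-Triples, one of the simpler RDF serializations. The record, the base URI, and the field-to-property mapping are all assumptions for illustration, not part of the recipe itself.

```python
# Minimal sketch: crosswalk a flat metadata record to RDF triples and
# serialize as N-Triples. The record and all URIs are hypothetical;
# Dublin Core element URIs are used as the predicates.

DC = "http://purl.org/dc/elements/1.1/"

def record_to_ntriples(record_uri, record):
    """Map a dict of Dublin Core-ish fields to N-Triples lines."""
    lines = []
    for field, value in sorted(record.items()):
        predicate = DC + field
        # Escape backslashes and quotes per the N-Triples grammar.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> <%s> "%s" .' % (record_uri, predicate, literal))
    return "\n".join(lines)

record = {
    "title": "Selection of Early Statistical Papers of J. Neyman",
    "creator": "Neyman, Jerzy",
    "date": "1967",
}

print(record_to_ntriples("http://example.org/id/work/527725", record))
```

The resulting file can simply be saved on a Web server, which is the whole point of the recipe: the first pass need not be perfect, only published.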