[CODE4LIB] linked data recipe

2013-11-19 Thread Eric Lease Morgan
I believe participating in the Semantic Web and providing content via the 
principles of linked data is not rocket surgery, especially for cultural 
heritage institutions -- libraries, archives, and museums. Here is a simple 
recipe for their participation:

  1. use existing metadata standards (MARC, EAD, etc.) to describe
 collections

  2. use any number of existing tools to convert the metadata to
 HTML, and save the HTML on a Web server

  3. use any number of existing tools to convert the metadata to
 RDF/XML (or some other serialization of RDF), and save the
 RDF/XML on a Web server

  4. rest, congratulate yourself, and share your experience with
 others in your domain

  5. after the first time through, go back to Step #1, but this time
 work with other people inside your domain making sure you use as
 many of the same URIs as possible

  6. after the second time through, go back to Step #1, but this
 time supplement access to your linked data with a triple store,
 thus supporting search

  7. after the third time through, go back to Step #1, but this
 time use any number of existing tools to expose the content in
 your other information systems (relational databases, OAI-PMH
 data repositories, etc.)

  8. for dessert, cogitate ways to exploit the linked data in your
 domain to discover new and additional relationships between URIs,
 and thus make the Semantic Web more of a reality 
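
As a concrete illustration of Steps #2 and #3, here is a minimal sketch in 
Python using the rdflib library; the URI, file names, and the little Dublin 
Core description are only stand-ins for whatever your metadata contains:

  from rdflib import Graph, Literal, URIRef
  from rdflib.namespace import DC

  # a URI for the collection being described (hypothetical)
  item = URIRef("http://example.org/collection/42")

  # build the graph: a couple of Dublin Core statements
  g = Graph()
  g.add((item, DC.title, Literal("Papers of Jane Doe")))
  g.add((item, DC.creator, Literal("Doe, Jane")))

  # Step #3: serialize the graph as RDF/XML for the Web server
  g.serialize(destination="42.rdf", format="xml")

  # Step #2: write a trivial HTML surrogate of the same description
  with open("42.html", "w") as fh:
      fh.write("<html><body><h1>Papers of Jane Doe</h1>"
               "<p>Doe, Jane</p></body></html>")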

What do you think?

I am in the process of writing a guidebook on the topic of linked data and 
archives. In the guidebook I will elaborate on this recipe and provide 
instructions for its implementation. [1]

[1] guidebook - http://sites.tufts.edu/liam/

--
Eric Lease Morgan
University of Notre Dame


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Brian Zelip
It's a great start, Eric. It helps me think that I can do it. Looking
forward to more.

Brian Zelip
UIUC


On Tue, Nov 19, 2013 at 7:04 AM, Eric Lease Morgan emor...@nd.edu wrote:

 I believe participating in the Semantic Web and providing content via the
 principles of linked data is not rocket surgery, especially for cultural
 heritage institutions … 



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Robert Forkel
Hi Eric,
while I also think this is not rocket surgery, I'd like to point out that
trial (and potentially error) as suggested by your "go back to Step #1"
instructions is not a good solution to coming up with URIs. I think once
published - i.e. put on a webserver - you should be able to keep the URIs
in your RDF persistent. Otherwise you are polluting the Semantic Web with
dead links and making it hard for aggregators to find out whether the data
they harvested is still valid.
So while iterative approaches are pragmatic and often work out well, for
the particular issue of coming up with URIs I'd recommend spending as much
thought before publishing as you can.
best
robert



On Tue, Nov 19, 2013 at 2:04 PM, Eric Lease Morgan emor...@nd.edu wrote:

 I believe participating in the Semantic Web and providing content via the
 principles of linked data is not rocket surgery, especially for cultural
 heritage institutions … 



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Karen Coyle
Eric, I think this skips a step - which is the design step in which you 
create a domain model that uses linked data as its basis. RDF is not a 
serialization; it actually may require you to re-think the basic 
structure of your metadata. The reason for that is that RDF provides 
capabilities that record-based data models do not. Rather than starting 
with current metadata, you need to take a step back and ask: what does 
my information world look like as linked data?


I repeat: RDF is NOT A SERIALIZATION.
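
To make the distinction concrete, here is a small sketch using Python's 
rdflib library: one and the same graph -- the data model -- written out in 
three different serializations:

  from rdflib import Graph

  # one triple, entered here in Turtle; the Graph object is the model
  g = Graph()
  g.parse(data='<http://example.org/book/1> '
               '<http://purl.org/dc/elements/1.1/title> "Example" .',
          format="turtle")

  # three serializations of the identical data model
  print(g.serialize(format="xml"))      # RDF/XML
  print(g.serialize(format="turtle"))   # Turtle
  print(g.serialize(format="nt"))       # N-Triples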

kc

On 11/19/13 5:04 AM, Eric Lease Morgan wrote:

I believe participating in the Semantic Web and providing content via the 
principles of linked data is not rocket surgery, especially for cultural 
heritage institutions … 


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Aaron Rubinstein
I think you’ve hit the nail on the head here, Karen. I would just add, or maybe 
reassure, that this does not necessarily require rethinking your existing 
metadata, but rather rethinking how to translate that existing metadata into a 
linked data environment. Though this might seem like a pain, in many cases it 
will actually inspire you to go back and improve/increase the value of that 
existing metadata.

This is definitely looking awesome, Eric!

Aaron

On Nov 19, 2013, at 9:41 AM, Karen Coyle li...@kcoyle.net wrote:

 Eric, I think this skips a step - which is the design step in which you 
 create a domain model that uses linked data as its basis. …


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ethan Gruber
I'm not sure that I agree that RDF is not a serialization. It really
depends on the context of the system and intended use of the linked data.
For example, TEI is designed with a specific purpose which cannot be
replicated in RDF (at least, not easily), but deriving RDF from
highly-linked TEI to put into an endpoint can open doors to queries which
are otherwise impossible to make on the data. This certainly requires some
rethinking of the way texts interact. But perhaps it is best to say
that RDF *can* (but need not) be a derivation, rather than a
serialization, of some larger, more complex canonical data model.
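
For instance, here is a sketch in Python with rdflib of the kind of question 
that becomes easy to ask once TEI-derived triples sit in one graph; the file 
name and the choice of vocabularies are hypothetical:

  from rdflib import Graph

  # load RDF previously derived from a set of TEI documents
  g = Graph()
  g.parse("tei-derived.rdf")

  # find texts whose creators are linked to the same external identity --
  # a cross-document join that raw XML tooling handles only awkwardly
  q = """
  SELECT ?text ?other WHERE {
    ?text   <http://purl.org/dc/elements/1.1/creator> ?person .
    ?person <http://www.w3.org/2002/07/owl#sameAs>    ?other .
  }
  """
  for row in g.query(q):
      print(row.text, row.other)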

Ethan


On Tue, Nov 19, 2013 at 9:54 AM, Aaron Rubinstein 
arubi...@library.umass.edu wrote:

 I think you’ve hit the nail on the head here, Karen. …



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ross Singer
That's still not a serialization. It's just a similar data model.
Pretty huge difference.

-Ross.


On Tue, Nov 19, 2013 at 10:31 AM, Ethan Gruber ewg4x...@gmail.com wrote:

 I'm not sure that I agree that RDF is not a serialization. …



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ethan Gruber
I see that serialization has a different definition in computer science
than I thought it did.


On Tue, Nov 19, 2013 at 10:36 AM, Ross Singer rossfsin...@gmail.com wrote:

 That's still not a serialization. It's just a similar data model. …



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Eric Lease Morgan
On Nov 19, 2013, at 8:48 AM, Robert Forkel xrotw...@googlemail.com wrote:

 while I also think this is not rocket surgery, I'd like to point out that
 trial (and potentially error) as suggested by your "go back to Step #1"
 instructions is not a good solution to coming up with URIs. … So while
 iterative approaches are pragmatic and often work out well, for the
 particular issue of coming up with URIs I'd recommend spending as much
 thought before publishing as you can.


Intellectually, I completely understand.

Practically, I still advocate publishing the linked data as soon as 
possible. Knowledge is refined over time. The data being published is not 
incorrect nor invalid, just not as good as it could be. Data aggregators will 
refresh their stores and old information will go to “Big Byte Heaven”. It is 
just like a library collection. The “best” books are collected. The good ones 
get used. The old ones get weeded or relegated to off-site storage. What 
remains is a current perception of truth. Building library collections is a 
process that is never done and never perfect. Linked data is a literal 
reflection of library collections; therefore linked data is never done nor 
perfect either. URIs will break. Books will be removed from the collection. 
URIs will go stale. 

The process of providing linked data is a lot like painting a painting. The 
painting is painted as a whole, from start to finish. One does not get one 
corner of the canvas perfect and move on from there. An idea is articulated. 
An outline is drawn. The outline is refined, and the painting gradually comes 
to life. Many times paintings are never finished but worked, reworked, and 
worked some more. 

If the profession looks to make perfect its list of URIs, then it will never 
leave the starting gate. I know that is not being advocated, but since one 
cannot measure the timeless validity of a URI, I advocate that the current 
URIs are good enough, with the understanding that there is a commitment to 
updating and refining them in the future.

— 
Eric Morgan


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Eric Lease Morgan
On Nov 19, 2013, at 9:41 AM, Karen Coyle li...@kcoyle.net wrote:

 Eric, I think this skips a step - which is the design step in which you 
 create a domain model that uses linked data as its basis. RDF is not a 
 serialization … Rather than starting with current metadata, you need to 
 take a step back and ask: what does my information world look like as 
 linked data?


I respectfully disagree. I do not think it necessary to create a domain model 
ahead of time; I do not think it is necessary for us to re-think our metadata 
structures. There already exist tools enabling us — cultural heritage 
institutions — to manifest our metadata as RDF. The manifestations may not be 
perfect, but “we need to learn to walk before we run” and the metadata 
structures we have right now will work for right now. As we mature we can 
refine our processes. I do not advocate “stepping back and asking”. I advocate 
looking forward and doing. —Eric Morgan


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ross Singer
I don't know what your definition of "serialization" is, but I don't know
of any in which "data model" and "formatted output of a data model" are
synonymous.

RDF is a data model *not* a serialization.

-Ross.


On Tue, Nov 19, 2013 at 10:45 AM, Ethan Gruber ewg4x...@gmail.com wrote:

 I see that serialization has a different definition in computer science
 than I thought it did. …



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ethan Gruber
yo, i get it


On Tue, Nov 19, 2013 at 10:54 AM, Ross Singer rossfsin...@gmail.com wrote:

 I don't know what your definition of "serialization" is, but I don't know
 of any in which "data model" and "formatted output of a data model" are
 synonymous.

 RDF is a data model *not* a serialization. …

Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Karen Coyle
Eric, if you want to leap into the linked data world in the fastest, 
easiest way possible, then I suggest looking at microdata markup, e.g. 
schema.org.[1] Schema.org does not require you to transform your data at 
all: it only requires mark-up of your online displays. This makes sense 
because as long as your data is in local databases, it's not visible to 
the linked data universe anyway; so why not take the easy way out and 
just add linked data to your public online displays? This doesn't 
require a transformation of your entire record (some of which may not be 
suitable as linked data in any case), only those things that are 
likely to link usefully. This latter generally means things for which 
you have an identifier. And you make no changes to your database, only 
to display.
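
For example, a catalog display marked up with schema.org microdata might look 
like this (a hypothetical record; the values are illustrative, and the 
technique is nothing more than extra attributes in the HTML you already 
serve):

  <div itemscope itemtype="http://schema.org/Book">
    <h1 itemprop="name">The Adventures of Tom Sawyer</h1>
    by <span itemprop="author">Mark Twain</span>
  </div>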


OCLC is already producing this markup in WorldCat records [2] -- not 
perfectly, of course, lots of warts, but it is a first step. However, it 
is a first step that makes more sense to me than *transforming* or 
*cross-walking* current metadata. It also, I believe, will help us 
understand what bits of our current metadata will make the transition to 
linked data, and what bits should remain as accessible documents that 
users can reach through linked data.


kc
[1] http://schema.org, and look at the work going on to add 
bibliographic properties at 
http://www.w3.org/community/schemabibex/wiki/Main_Page
[2] look at the linked data section of any WorldCat page for a single 
item, such as 
http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725?referer=brief_results




On 11/19/13 7:54 AM, Eric Lease Morgan wrote:

I respectfully disagree. I do not think it necessary to create a domain model 
ahead of time; I do not think it is necessary for us to re-think our metadata 
structures. … I do not advocate “stepping back and asking”. I advocate 
looking forward and doing. —Eric Morgan


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Ethan Gruber
Hasn't the pendulum swung back toward RDFa Lite (
http://www.w3.org/TR/rdfa-lite/) recently? The two are fairly equivalent, but
I'm not sure about all the politics involved.


On Tue, Nov 19, 2013 at 11:09 AM, Karen Coyle li...@kcoyle.net wrote:

 Eric, if you want to leap into the linked data world in the fastest,
 easiest way possible, then I suggest looking at microdata markup, e.g.
 schema.org. …



Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Eric Lease Morgan
On Nov 19, 2013, at 9:54 AM, Aaron Rubinstein arubi...@library.umass.edu 
wrote:

 I think you’ve hit the nail on the head here, Karen. I would just
 add, or maybe reassure, that this does not necessarily require
 rethinking your existing metadata but how to *translate* that
 existing metadata into a linked data environment. Though this
 might seem like a pain, in many cases it will actually inspire
 you to go back and improve/increase the value of that existing
 metadata...


There are tools allowing people to translate existing metadata into a linked 
data environment, and for right now, I advocate that they are good enough. I 
will provide simplistic examples.

For people who maintain MARC records:

  1. convert the MARC records to MARCXML with the MARCXML Toolkit [1]
  2. convert the MARCXML to RDF/XML in the manner of BIBFRAME’s
     transformation service [2]
  3. save the resulting RDF/XML on a Web server
  4. convert the MARC (or MARCXML) into (valid) HTML
  5. save the resulting HTML on a Web server
  6. for extra credit, implement a content negotiation service for the
     HTML and RDF/XML
  7. for extra extra credit, implement a SPARQL endpoint for your RDF

If one does Steps #1 through #5, then they are doing linked data and 
participating in the Semantic Web. That is the goal.
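
The steps above name the MARCXML Toolkit, but as a rough illustration the 
same Step #1 conversion can also be sketched in Python with the pymarc 
library (the file names are hypothetical):

  from pymarc import MARCReader, record_to_xml

  # read binary MARC and emit one MARCXML document per record
  with open("records.mrc", "rb") as fh:
      for i, record in enumerate(MARCReader(fh)):
          with open("record-%d.xml" % i, "wb") as out:
              out.write(record_to_xml(record, namespace=True))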

For people who maintain EAD files:

  1. transform the EAD files into RDF/XML with a stylesheet created by
     the Archives Hub [3]
  2. save the resulting RDF/XML on a Web server
  3. transform the EAD into HTML, using your favorite EAD to HTML
     stylesheet [4]
  4. save the resulting HTML on a Web server
  5. for extra credit, implement a content negotiation service for the
     HTML and RDF/XML
  6. for extra extra credit, implement a SPARQL endpoint for your RDF

If one does Steps #1 through #4 of this example, then they are doing linked 
data and participating in the Semantic Web. That is the goal.
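
Step #1 of this second list, sketched in Python with lxml; the file names 
are hypothetical, and this assumes the stylesheet runs under an XSLT 1.0 
processor (if it requires XSLT 2.0, substitute a processor such as Saxon):

  from lxml import etree

  # compile the Archives Hub stylesheet and apply it to one finding aid
  transform = etree.XSLT(etree.parse("ead2rdf.xsl"))
  rdf = transform(etree.parse("finding_aid.xml"))

  # write the resulting RDF/XML for the Web server
  with open("finding_aid.rdf", "w") as out:
      out.write(str(rdf))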

In both examples the end result will be a valid linked data implementation. Not 
complete. Not necessarily as thorough as desired. Not necessarily as accurate 
as desired. But valid. Such a process will not expose false, incorrect 
data/information, but rather data/information that is intended to be 
maintained, improved, and updated on a continual basis.
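
The extra-credit content negotiation service mentioned in both lists can be 
sketched in a few lines of Python; this sketch assumes the Flask library, and 
the route and file names are hypothetical:

  from flask import Flask, request, send_file

  app = Flask(__name__)

  @app.route("/record/42")
  def record():
      # serve RDF to RDF-aware clients and HTML to everybody else,
      # based on the HTTP Accept header
      best = request.accept_mimetypes.best_match(
          ["application/rdf+xml", "text/html"], default="text/html")
      if best == "application/rdf+xml":
          return send_file("42.rdf", mimetype="application/rdf+xml")
      return send_file("42.html", mimetype="text/html")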

Finally, I want to highlight a distinction between well-formed, valid, and 
accurate information — linked data. I will use XML as an example. XML can be 
“well-formed”. This means it is syntactically correct. Specific characters are 
represented by entities. Elements are correctly opened and closed. The whole 
structure has a single root. Etc. The next level up is “valid”. Valid XML is 
XML that conforms to a DTD or schema; it is semantically correct. It means that 
required elements exist, and are presented in a particular order. Specific 
attributes used in elements are denoted. And in the case of schemas, values in 
elements and attributes take on particular shapes beyond simple character data. 
Finally, XML can be “accurate” (my term). This means the assertions in the XML 
are true. For example, there is nothing stopping me from putting the title of a 
work in an author element. How is the computer expected to know the difference? 
It can’t. Alternatively, the title could be presented as “Thee Adventrs Av Tom 
Sawher”, when the more accurate title may be “The Adventures of Tom Sawyer”. 
Well-formedness and validity are the domain of computers. Accuracy is the 
domain of humans. In the world of linked data, you are not participating if 
your published data is not “well-formed”. (Go back to start.) You are 
participating if it is “valid”. But you are doing really well if the data is 
“accurate”. 
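
The same three-level distinction can be sketched in Python with lxml; the 
schema file is hypothetical:

  from lxml import etree

  # well-formed? -- fromstring() raises XMLSyntaxError if it is not
  doc = etree.fromstring(
      "<record><title>The Adventures of Tom Sawyer</title></record>")

  # valid? -- does the document conform to a schema?
  schema = etree.XMLSchema(etree.parse("record.xsd"))
  print(schema.validate(doc))   # True or False

  # accurate? -- whether the title really is the title, no parser can say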

Let’s not make this more difficult than it really is.

[1] MARCXML Toolkit - linked at http://www.loc.gov/standards/marcxml/
[2] BIBFRAME’s transformation service - 
http://bibframe.org/tools/transform/start
[3] Archives Hub stylesheet - http://data.archiveshub.ac.uk/xslt/ead2rdf.xsl
[4] EAD to HTML - for example, 
http://www.catholicresearch.net/data/ead/ead2html.xsl

— 
Eric Morgan


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Eric Lease Morgan
On Nov 19, 2013, at 11:09 AM, Karen Coyle li...@kcoyle.net wrote:

 Eric, if you want to leap into the linked data world in the fastest, 
 easiest way possible, then I suggest looking at microdata markup, e.g. 
 schema.org. [1] …
 
 [1] http://schema.org


I don’t advocate this as the fastest, easiest way possible because it forces 
RDF “aggregators” to parse HTML, and thus passes a level of complexity down the 
processing chain. Expose RDF as RDF, not embedded in another format. I do 
advocate the inclusion of schema.org mark-up, RDFa, etc. into HTML, but as a 
later refinement. —Eric Morgan


Re: [CODE4LIB] linked data recipe

2013-11-19 Thread Bigwood, David
+1 for schema.org as one of the first steps. COinS is another useful, simple 
mark-up if the data is already there.

I'm looking forward to the book.

Sincerely,
David Bigwood
Lunar and Planetary Institute


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen 
Coyle
Sent: Tuesday, November 19, 2013 10:10 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] linked data recipe

Eric, if you want to leap into the linked data world in the fastest, easiest 
way possible, then I suggest looking at microdata markup, e.g. 
schema.org. …