Re: [CODE4LIB] rdf serialization
Ethan, thanks, it's good to have examples. I'd say that for simple linking SPARQL may not be necessary, perhaps should be avoided, but IF you need something ELSE, say a query WHERE you have conditions, THEN you may find that a query language is needed.

kc

On 11/6/13 9:14 AM, Ethan Gruber wrote:
I think that the answer to #1 is that if you want or expect people to use your endpoint that you should document how it works: the ontologies, the models, and a variety of example SPARQL queries, ranging from simple to complex. The British Museum's SPARQL endpoint (http://collection.britishmuseum.org/sparql) is highly touted, but how many people actually use it? I understand your point about SPARQL being too complicated for an API interface, but the best examples of services built on SPARQL are probably the ones you don't even realize are built on SPARQL (e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab). So on one hand, perhaps only the most dedicated and hardcore researchers will venture to construct SPARQL queries for your endpoint, but on the other, you can build some pretty visualizations based on SPARQL queries conducted in the background from the user's interaction with a simple HTML/JavaScript based interface.

Ethan

On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer rossfsin...@gmail.com wrote:
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company that offered, and has since abandoned, a SPARQL-based triple store service), but I just don't see the interest in arbitrary SPARQL queries against remote datasets that I do against linking to (and grabbing) known items. I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL endpoint, where do you even start with constructing useful queries?

2) SPARQL as a query language is a combination of being too powerful and completely useless in practice: query timeouts are commonplace, endpoints don't support all of 1.1, etc. And, going back to point #1, it's hard to know how to optimize your queries unless you are already pretty familiar with the data.

3) SPARQL is a flawed API interface from the get-go (IMHO) for the same reason we don't offer a public SQL interface to our RDBMSes.

Which isn't to say it doesn't have its uses or applications. I just think that in most cases domain/service-specific APIs (be they RESTful, based on the Linked Data API [0], whatever) will likely be favored over generic SPARQL endpoints. Are n+1 different APIs ideal? I am pretty sure the answer is no, but that's the future I foresee, personally.

-Ross.

0. https://code.google.com/p/linked-data-api/wiki/Specification

On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote:
Ross, I agree with your statement that data doesn't have to be RDF all the way down, etc. But I'd like to hear more about why you think SPARQL availability has less value, and if you see an alternative to SPARQL for querying.

kc

On 11/6/13 8:11 AM, Ross Singer wrote:
Hugh, I don't think you're in the weeds with your question (and, while I think that named graphs can provide a solution to your particular problem, that doesn't necessarily mean that it doesn't raise more questions or potentially more frustrations down the line - like any new power, it can be used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this? That is, why bother with the broad and general model in what I assume is a closed world assumption in your application? We don't generally use XML databases (MarkLogic being a notable exception), or MARC databases, or insert-your-transmission-format-of-choice databases, because usually transmission formats are designed to account for lots and lots of variations and maximum flexibility, which generally is the opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can be represented in RDF (and, possibly, available via SPARQL, but I think there is *far* less value there) and committing to RDF all the way down. RDF is a generalization so multiple parties can agree on what data means, but I would have a hard time swallowing the argument that domain-specific data must be RDF-native.

-Ross.

On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote:
Does that work right down to the level of the individual triple though? If a large percentage of my triples are each in their own individual graphs, won't that be chaos? I really don't know the answer, it's not a rhetorical question! Hugh

On Nov 6, 2013, at 10:40, Robert Sanderson azarot...@gmail.com wrote:
Named Graphs are the way to solve the issue you bring up in that post, in my opinion. You mint an identifier for the graph, and associate the provenance and other information with that. This then gets ingested as the
Re: [CODE4LIB] rdf serialization
Ross, I think you are not alone, as per this: http://howfuckedismydatabase.com/nosql/

kc

On 11/6/13 8:54 AM, Ross Singer wrote:
Hey Karen, It's purely anecdotal (albeit anecdotes borne from working at a company that offered, and has since abandoned, a sparql-based triple store service), but I just don't see the interest in arbitrary SPARQL queries against remote datasets that I do against linking to (and grabbing) known items. [...]

On Nov 6, 2013, at 10:40, Robert Sanderson azarot...@gmail.com wrote:
Named Graphs are the way to solve the issue you bring up in that post, in my opinion. You mint an identifier for the graph, and associate the provenance and other information with that. This then gets ingested as the 4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{ "@id": "uri-for-graph", "dcterms:creator": "uri-for-hugh", "@graph": [ // ... triples go here ... ] }

Rob

On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless philomou...@gmail.com wrote:
I wrote about this a few months back at http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/ I'd be very interested to hear what the smart folks here think! Hugh

On Nov 5, 2013, at 18:28, Alexander Johannesen alexander.johanne...@gmail.com wrote:
But the question to every piece of meta data is *authority*, which is the part of RDF that sucks.

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 skype: kcoylenet
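Rob's inline JSON-LD is easier to see as well-formed JSON. A minimal sketch of the same named-graph pattern, built and serialized with Python's standard library; the `uri-for-...` values are his placeholders, not real identifiers, and the one triple inside `@graph` is invented purely for illustration:

```python
import json

# Rob's pattern: mint a URI for the graph itself, attach provenance
# (dcterms:creator) to that URI, and nest the actual triples in @graph.
named_graph = {
    "@context": {"dcterms": "http://purl.org/dc/terms/"},
    "@id": "uri-for-graph",                      # placeholder graph URI
    "dcterms:creator": {"@id": "uri-for-hugh"},  # placeholder person URI
    "@graph": [
        # triples go here; one hypothetical statement as an example:
        {"@id": "uri-for-subject", "dcterms:title": "An asserted title"},
    ],
}

print(json.dumps(named_graph, indent=2))
```

Ingested into a quad store, the graph's "@id" becomes the fourth element of each quad, which is how the provenance survives alongside the triples.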
Re: [CODE4LIB] rdf serialization
Karen,

The URIs you gave get me to webpages *about* the Declaration of Independence. I'm sure it's just a copy/paste mistake, but in this context you want the exact right URIs of course. And by better I guess you meant probably more widely used and probably longer lasting? :)

LOC URI for the DoI (the work) is without .html: http://id.loc.gov/authorities/names/n79029194
VIAF URI for the DoI is without the trailing /: http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC - me
http://companjen.name/id/BC.html - about me

On 05-11-13 19:03, Karen Coyle li...@kcoyle.net wrote:
Eric, I found an even better URI for you for the Declaration of Independence: http://id.loc.gov/authorities/names/n79029194.html Now that could be seen as being representative of the name chosen by the LC Name Authority, but the related VIAF record, as per the VIAF definition of itself, represents the real world thing itself. That URI is: http://viaf.org/viaf/179420344/ I noticed that this VIAF URI isn't linked from the Wikipedia page, so I will add that. kc
Re: [CODE4LIB] rdf serialization
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen ben.compan...@dans.knaw.nl wrote:
The URIs you gave get me to webpages *about* the Declaration of Independence. I'm sure it's just a copy/paste mistake, but in this context you want the exact right URIs of course. And by better I guess you meant probably more widely used and probably longer lasting? :)
LOC URI for the DoI (the work) is without .html: http://id.loc.gov/authorities/names/n79029194
VIAF URI for the DoI is without the trailing /: http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the W3C's approach to educating the world about URIs for real world things hasn't quite caught on, while RESTful ones (promoted by the IETF) have. If someone as knowledgeable as Karen can do that, what does it say about our ability as practitioners to use URIs this way, and about our ability to write software to do it as well?

In a REST world, when you get a 200 OK it doesn't mean the resource is a Web Document. The resource can be anything; you just happened to successfully get a representation of it. If you like, you can provide hints about the nature of the resource in the representation, but the resource itself never goes over the wire -- the representation does. It's a subtle but important difference in two ways of looking at Web architecture.

If you find yourself interested in making up your own mind about this, you can find the RESTful definitions of resource and representation in the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1]. You can find language about Web Documents (or at least its more recent variant, Information Resource) in the W3C's Architecture of the World Wide Web [2]. Obviously I'm biased towards the IETF's position on this. This is just my personal opinion from my experience as a Web developer trying to explain Linked Data to practitioners, looking at the Web we have, and chatting with good friends who weren't afraid to tell me what they thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources
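Ed's resource/representation distinction is visible in plain HTTP: one URI, different representations chosen by the Accept header. A hedged sketch using the id.loc.gov URI from earlier in the thread; that this endpoint honors these particular media types is an assumption, and the requests are built but deliberately not sent:

```python
import urllib.request

# One URI names the resource (here, the LC authority for the
# Declaration of Independence). The Accept header asks for a
# particular representation; the resource itself never goes over
# the wire, only a representation of it does.
URI = "http://id.loc.gov/authorities/names/n79029194"

html_req = urllib.request.Request(URI, headers={"Accept": "text/html"})
rdf_req = urllib.request.Request(URI, headers={"Accept": "application/rdf+xml"})

# urllib.request.urlopen(html_req) would return 200 OK with an HTML
# page *about* the resource; in the RESTful reading, that does not
# make the resource itself a "Web Document".
```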
Re: [CODE4LIB] rdf serialization
Yes, I'm going to get sucked into this vi vs. emacs argument for nostalgia's sake... ROTFL, because that is exactly what I was thinking. “Vi is better. No, emacs. You are both wrong; it is all about BBEdit!” Each tool, whether an editor, an email client, or an RDF serialization, has its own strengths and weaknesses. Like religions, none of them are perfect, but they all have some value.

—ELM
Re: [CODE4LIB] rdf serialization
Ben, Yes, I copied from the browser URIs, and that was sloppy. However, it was the quickest thing to do, plus it was addressed to a human, not a machine. The URI for the LC entry is there on the page. Unfortunately, the VIAF URI is called Permalink -- which isn't obvious.

I guess if I want anyone to answer my emails, I need to post mistakes. When I post correct information, my mail goes unanswered (not even a thanks). So, thanks, guys.

kc

On 11/6/13 12:47 AM, Ben Companjen wrote: [...]

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
I could have known it was a test! ;) Thanks Karen :) On 06-11-13 15:20, Karen Coyle li...@kcoyle.net wrote: I guess if I want anyone to answer my emails, I need to post mistakes.
Re: [CODE4LIB] rdf serialization
I wrote about this a few months back at http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28, Alexander Johannesen alexander.johanne...@gmail.com wrote:
But the question to every piece of meta data is *authority*, which is the part of RDF that sucks.
Re: [CODE4LIB] rdf serialization
In the kinds of data I have to deal with, who made an assertion, or what sources provide evidence for a statement, are vitally important bits of information, so it's not just a data-source integration problem, where you're taking batches of triples from different sources and putting them together. It's a question of how to encode scholarly, messy, humanities data. The answer, of course, might be don't use RDF for that :-). I'd rather not invent something if I don't have to, though.

Hugh

On Nov 6, 2013, at 10:56, Robert Sanderson azarot...@gmail.com wrote:
A large number of triples that all have different provenance? I'm curious as to how you get them :) Rob

On Wed, Nov 6, 2013 at 8:52 AM, Hugh Cayless philomou...@gmail.com wrote:
Does that work right down to the level of the individual triple though? If a large percentage of my triples are each in their own individual graphs, won't that be chaos? I really don't know the answer, it's not a rhetorical question! Hugh [...]
Re: [CODE4LIB] rdf serialization
Hugh, I don't think you're in the weeds with your question (and, while I think that named graphs can provide a solution to your particular problem, that doesn't necessarily mean that it doesn't raise more questions or potentially more frustrations down the line - like any new power, it can be used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this? That is, why bother with the broad and general model in what I assume is a closed world assumption in your application? We don't generally use XML databases (MarkLogic being a notable exception), or MARC databases, or insert-your-transmission-format-of-choice databases, because usually transmission formats are designed to account for lots and lots of variations and maximum flexibility, which generally is the opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can be represented in RDF (and, possibly, available via SPARQL, but I think there is *far* less value there) and committing to RDF all the way down. RDF is a generalization so multiple parties can agree on what data means, but I would have a hard time swallowing the argument that domain-specific data must be RDF-native.

-Ross.

On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote: [...]
Re: [CODE4LIB] rdf serialization
Ross, I agree with your statement that data doesn't have to be RDF all the way down, etc. But I'd like to hear more about why you think SPARQL availability has less value, and if you see an alternative to SPARQL for querying.

kc

On 11/6/13 8:11 AM, Ross Singer wrote: [...]

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
The answer is purely because the RDF data model and the technology around it looks like it would almost do what we need it to. I do not, and cannot, assume a closed world. The open world assumption is one of the attractive things about RDF, in fact :-)

Hugh

On Nov 6, 2013, at 11:11, Ross Singer rossfsin...@gmail.com wrote:
My question for you, however, is why are you using a triple store for this? That is, why bother with the broad and general model in what I assume is a closed world assumption in your application?
Re: [CODE4LIB] rdf serialization
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company that offered, and has since abandoned, a sparql-based triple store service), but I just don't see the interest in arbitrary SPARQL queries against remote datasets that I do against linking to (and grabbing) known items. I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL endpoint, where do you even start with constructing useful queries?

2) SPARQL as a query language is a combination of being too powerful and completely useless in practice: query timeouts are commonplace, endpoints don't support all of 1.1, etc. And, going back to point #1, it's hard to know how to optimize your queries unless you are already pretty familiar with the data.

3) SPARQL is a flawed API interface from the get-go (IMHO) for the same reason we don't offer a public SQL interface to our RDBMSes.

Which isn't to say it doesn't have its uses or applications. I just think that in most cases domain/service-specific APIs (be they RESTful, based on the Linked Data API [0], whatever) will likely be favored over generic SPARQL endpoints. Are n+1 different APIs ideal? I am pretty sure the answer is no, but that's the future I foresee, personally.

-Ross.

0. https://code.google.com/p/linked-data-api/wiki/Specification

On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote: [...]
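Ross's point 1 is about knowing what to ask; the mechanics of asking are comparatively simple. A minimal sketch of a SPARQL query sent as an HTTP GET per the SPARQL 1.1 Protocol, using only the Python standard library. The endpoint is the British Museum one mentioned in the thread; whether it responds (or times out, per point 2) is not assumed here, so the request is built but not sent:

```python
import urllib.parse
import urllib.request

def build_sparql_request(endpoint, query):
    """Build a GET request per the SPARQL 1.1 Protocol, asking for
    JSON results via the Accept header."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})

# The kind of "where do I even start?" probe Ross describes:
# ten arbitrary triples from an unfamiliar dataset.
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
req = build_sparql_request("http://collection.britishmuseum.org/sparql", query)

# urllib.request.urlopen(req, timeout=30) would perform the call;
# omitted so this sketch has no network dependency.
print(req.full_url)
```

A domain-specific API of the kind Ross favors would typically wrap exactly this sort of call behind a friendlier route, which is also how Ethan's "you don't realize it's SPARQL" services work.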
Re: [CODE4LIB] rdf serialization
Hugh, I'm skeptical of this in a usable application or interface. Applications have constraints. There are predicates you care about, there are values you display in specific ways. There are expectations, based on the domain, in the data that are either driven by the interface or the needs of the consumers. I have yet to see an example of arbitrary and unexpected data exposed in an application that people actually use.

-Ross.

On Wed, Nov 6, 2013 at 11:39 AM, Hugh Cayless philomou...@gmail.com wrote:
The answer is purely because the RDF data model and the technology around it looks like it would almost do what we need it to. I do not, and cannot, assume a closed world. The open world assumption is one of the attractive things about RDF, in fact :-) Hugh

On Nov 6, 2013, at 11:11, Ross Singer rossfsin...@gmail.com wrote:
My question for you, however, is why are you using a triple store for this? That is, why bother with the broad and general model in what I assume is a closed world assumption in your application?
Re: [CODE4LIB] rdf serialization
I think that the answer to #1 is that if you want or expect people to use your endpoint that you should document how it works: the ontologies, the models, and a variety of example SPARQL queries, ranging from simple to complex. The British Museum's SPARQL endpoint ( http://collection.britishmuseum.org/sparql) is highly touted, but how many people actually use it? I understand your point about SPARQL being too complicated for an API interface, but the best examples of services built on SPARQL are probably the ones you don't even realize are built on SPARQL (e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab). So on one hand, perhaps only the most dedicated and hardcore researchers will venture to construct SPARQL queries for your endpoint, but on the other, you can build some pretty visualizations based on SPARQL queries conducted in the background from the user's interaction with a simple html/javascript based interface. Ethan On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer rossfsin...@gmail.com wrote: Hey Karen, It's purely anecdotal (albeit anecdotes borne from working at a company that offered, and has since abandoned, a sparql-based triple store service), but I just don't see the interest in arbitrary SPARQL queries against remote datasets that I do against linking to (and grabbing) known items. I think there are multiple reasons for this: 1) Unless you're already familiar with the dataset behind the SPARQL endpoint, where do you even start with constructing useful queries? 2) SPARQL as a query language is a combination of being too powerful and completely useless in practice: query timeouts are commonplace, endpoints don't support all of 1.1, etc. 
And, going back to point #1, it's hard to know how to optimize your queries unless you are already pretty familiar with the data 3) SPARQL is a flawed API interface from the get-go (IMHO) for the same reason we don't offer a public SQL interface to our RDBMSes Which isn't to say it doesn't have its uses or applications. I just think that in most cases domain/service-specific APIs (be they RESTful, based on the Linked Data API [0], whatever) will likely be favored over generic SPARQL endpoints. Are n+1 different APIs ideal? I am pretty sure the answer is no, but that's the future I foresee, personally. -Ross. 0. https://code.google.com/p/linked-data-api/wiki/Specification On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote: Ross, I agree with your statement that data doesn't have to be RDF all the way down, etc. But I'd like to hear more about why you think SPARQL availability has less value, and if you see an alternative to SPARQL for querying. kc On 11/6/13 8:11 AM, Ross Singer wrote: Hugh, I don't think you're in the weeds with your question (and, while I think that named graphs can provide a solution to your particular problem, that doesn't necessarily mean that it doesn't raise more questions or potentially more frustrations down the line - like any new power, it can be used for good or evil and the difference might not be obvious at first). My question for you, however, is why are you using a triple store for this? That is, why bother with the broad and general model in what I assume is a closed world assumption in your application? We don't generally use XML databases (Marklogic being a notable exception), or MARC databases, or insert your transmission format of choice-specific databases because usually transmission formats are designed to account for lots and lots of variations and maximum flexibility, which generally is the opposite of the modeling that goes into a specific app. 
I think there's a world of difference between modeling your data so it can be represented in RDF (and, possibly, available via SPARQL, but I think there is *far* less value there) and committing to RDF all the way down. RDF is a generalization so multiple parties can agree on what data means, but I would have a hard time swallowing the argument that domain-specific data must be RDF-native. -Ross.

On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote: Does that work right down to the level of the individual triple though? If a large percentage of my triples are each in their own individual graphs, won't that be chaos? I really don't know the answer, it's not a rhetorical question! Hugh

On Nov 6, 2013, at 10:40, Robert Sanderson azarot...@gmail.com wrote: Named Graphs are the way to solve the issue you bring up in that post, in my opinion. You mint an identifier for the graph, and associate the provenance and other information with that. This then gets ingested as the 4th URI into a quad store, so you don't lose the provenance information. In JSON-LD:

{ "@id": "uri-for-graph", "dcterms:creator": "uri-for-hugh", "@graph": [ // ...
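Robert's JSON-LD example is cut off above; a completed version might look something like this (an illustrative sketch with hypothetical URIs and an assumed @context mapping, not his original continuation):

```json
{
  "@context": { "dcterms": "http://purl.org/dc/terms/" },
  "@id": "http://example.org/graph/annotations-by-hugh",
  "dcterms:creator": { "@id": "http://example.org/people/hugh" },
  "@graph": [
    {
      "@id": "http://example.org/statement/1",
      "dcterms:title": "a triple whose provenance hangs off the graph URI"
    }
  ]
}
```

When this is loaded into a quad store, the outer "@id" becomes the fourth element of every triple inside "@graph", which is how the dcterms:creator provenance survives ingest.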
Re: [CODE4LIB] rdf serialization
On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote: This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using a strict syntax, specifically a strict syntax of triples. It is very difficult for humans to articulate knowledge, let alone codify it. How realistic is the idea of the Semantic Web? I wonder this not because I don’t think the technology can handle the problem. I say this because I think people can’t (or have great difficulty) succinctly articulating knowledge. Or maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded completely in triples, any more than it can be encoded completely in finding aids or books. One thing that I (naively) wasn't fully aware of when I started dabbling in the Semantic Web and Linked Data is how much the technology is entangled with debates about the philosophy of language. These debates play out in a variety of ways, but most notably in disagreements about the nature of a resource (httpRange-14) in Web Architecture. Shameless plug: Dorothea Salo and I tried to write about how some of this impacts the domain of the library/archive [1]. One of the strengths of RDF is its notion of a data model that sits behind the various serializations (xml, ntriples, json, n3, turtle, etc). I'm with Ross though: I find it much easier to read rdf as turtle or json-ld than as rdf/xml. //Ed [1] http://arxiv.org/abs/1302.4591
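Ed's point that one data model sits behind many serializations can be shown with a single statement written two ways in one snippet (an illustrative example; the URIs are the LC identifier and Wikipedia page used elsewhere in this thread):

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .

# Turtle, with a prefix:
<http://en.wikipedia.org/wiki/Declaration_of_Independence>
    dcterms:creator <http://id.loc.gov/authorities/names/n79089957> .

# The same triple spelled out N-Triples style (N-Triples is a subset of Turtle):
<http://en.wikipedia.org/wiki/Declaration_of_Independence> <http://purl.org/dc/terms/creator> <http://id.loc.gov/authorities/names/n79089957> .
```

Both statements parse to the identical triple, and because an RDF graph is a set, asserting it twice changes nothing; only the surface syntax differs.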
Re: [CODE4LIB] rdf serialization
On 11/5/13 6:45 AM, Ed Summers wrote: I'm with Ross though: ... and Karen! I find it much easier to read rdf as turtle or json-ld than as rdf/xml. It's easier to read, but it's also easier to create *correctly*, and that, to me, is the key point. Folks who are used to XML have a certain notion of data organization in mind. Working with RDF in XML, one tends to fall into XML data-think rather than the RDF concepts. I have suggested (repeatedly) to LC on the BIBFRAME list that they should use turtle rather than RDF/XML in their examples -- because I suspect that they may be doing some XML-think in the background. This seems to be the case because in some of the BIBFRAME documents the examples are in XML but not RDF/XML. I find this rather ... disappointing. I also find it useful to create pseudo-code triples using whatever notation I find handy, as in the example I provided earlier for Eric. Writing out actual valid triples is a pain, but seeing your data as triples is very useful. kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
FWIW, Here’s the W3C’s RDF Primer with examples in turtle instead of RDF/XML: http://www.w3.org/2007/02/turtle/primer/ And the turtle spec: http://www.w3.org/TR/turtle/ Aaron

On Nov 5, 2013, at 10:07 AM, Karen Coyle li...@kcoyle.net wrote: [...]
Re: [CODE4LIB] rdf serialization
On Tue, Nov 5, 2013 at 10:07 AM, Karen Coyle li...@kcoyle.net wrote: I have suggested (repeatedly) to LC on the BIBFRAME list that they should use turtle rather than RDF/XML in their examples -- because I suspect that they may be doing some XML think in the background. This seems to be the case because in some of the BIBFRAME documents the examples are in XML but not RDF/XML. I find this rather ... disappointing. I think you'll find that many people and organizations are much more familiar with xml and its data model than they are with rdf. Sometimes when people with a strong background in xml come to rdf they naturally want to keep thinking in terms of xml. This is possible up to a point, but it eventually hampers understanding. //Ed
Re: [CODE4LIB] rdf serialization
Ed -- thanks for the link -- you and Dorothea have written a tremendously clear and useful piece. Sheila

-----Original Message----- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ed Summers Sent: Tuesday, November 05, 2013 9:45 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] rdf serialization [...] [1] http://arxiv.org/abs/1302.4591
Re: [CODE4LIB] rdf serialization
Eric, I found an even better URI for you for the Declaration of Independence: http://id.loc.gov/authorities/names/n79029194.html Now that could be seen as being representative of the name chosen by the LC Name Authority, but the related VIAF record, as per the VIAF definition of itself, represents the real world thing itself. That URI is: http://viaf.org/viaf/179420344/ I noticed that this VIAF URI isn't linked from the Wikipedia page, so I will add that. kc

On 11/2/13 9:00 PM, Eric Lease Morgan wrote: How can I write an RDF serialization enabling me to express the fact that the United States Declaration Of Independence was written by Thomas Jefferson and Thomas Jefferson was a male? (And thus asserting that the Declaration of Independence was written by a male.) Suppose I have the following assertion:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- the Declaration Of Independence was authored by Thomas Jefferson -->
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
  </rdf:Description>
</rdf:RDF>

Suppose I have a second assertion:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <!-- Thomas Jefferson was a male -->
  <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
    <foaf:gender>male</foaf:gender>
  </rdf:Description>
</rdf:RDF>

Now suppose a cool Linked Data robot came along and harvested my RDF/XML. Moreover let's assume the robot could make the logical conclusion that the Declaration was written by a male. How might the robot express this fact in RDF/XML?
The following is my first attempt at such an expression, but the resulting graph (attached) doesn't seem to visually express what I really want:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
    <foaf:gender>male</foaf:gender>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
  </rdf:Description>
</rdf:RDF>

Am I doing something wrong? How might you encode the following expression — The Declaration Of Independence was authored by Thomas Jefferson, and Thomas Jefferson was a male. And therefore, the Declaration Of Independence was authored by a male named Thomas Jefferson? Maybe RDF cannot express this fact because it requires two predicates in a single expression, and thus the expression would not be a triple but rather a "quadruple" — object, predicate #1, subject/object, predicate #2, and subject? — Eric Morgan

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
On Tue, Nov 5, 2013 at 9:45 AM, Ed Summers e...@pobox.com wrote: On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote: This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using a strict syntax, specifically a strict syntax of triples. It is very difficult for humans to articulate knowledge, let alone codifying it. How realistic is the idea of the Semantic Web? I wonder this not because I don’t think the technology can handle the problem. I say this because I think people can’t (or have great difficulty) succinctly articulating knowledge. Or maybe knowledge does not fit into triples? I think you're right Eric. I don't think knowledge can be encoded completely in triples, any more than it can be encoded completely in finding aids or books.

Or... anything, honestly. We're humans. Our understanding and perception of the universe changes daily. I don't think it's unreasonable to accept that any description of the universe, input by a human, will reflect the fundamental reality that what was encoded might be wrong. I don't really buy the argument that RDF is somehow less capable of succinctly articulating knowledge compared to anything else. All models are wrong. Some are useful.

One thing that I (naively) wasn't fully aware of when I started dabbling in the Semantic Web and Linked Data is how much the technology is entangled with debates about the philosophy of language. These debates play out in a variety of ways, but most notably in disagreements about the nature of a resource (httpRange-14) in Web Architecture. Shameless plug: Dorothea Salo and I tried to write about how some of this impacts the domain of the library/archive [1].

OTOH, schema.org doesn't concern itself at all with this dichotomy (information vs. non-information resource) and I think that most (sane, pragmatic) practitioners would consider that linked data, as well.
Given the fact that schema.org is so easily mapped to RDF, I think this argument is going to be so polluted (if it isn't already) that it will eventually have to evolve to a far less academic position.

One of the strengths of RDF is its notion of a data model that is behind the various serializations (xml, ntriples, json, n3, turtle, etc). I'm with Ross though: I find it much easier to read rdf as turtle or json-ld than as rdf/xml.

This is definitely where RDF outclasses almost every alternative*, because each serialization (besides RDF/XML) works extremely well for specific purposes: Turtle is great for writing RDF (for either humans or computers) and being able to understand what is being modeled; n-triples/quads is great for sharing data in bulk; json-ld is ideal for API responses, since the consumer doesn't have to know anything about RDF to have a useful data object, but if they do, all the better. -Ross.

* Unless you're writing a parser, then having a kajillion serializations seriously sucks.
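Ross's claim that a JSON-LD consumer doesn't have to know anything about RDF can be illustrated with the thread's running example (a sketch; the @context term names are hypothetical):

```json
{
  "@context": {
    "creator": "http://purl.org/dc/terms/creator",
    "gender": "http://xmlns.com/foaf/0.1/gender"
  },
  "@id": "http://en.wikipedia.org/wiki/Declaration_of_Independence",
  "creator": {
    "@id": "http://id.loc.gov/authorities/names/n79089957",
    "gender": "male"
  }
}
```

To a plain JavaScript consumer this is just a nested object with "creator" and "gender" keys; to an RDF-aware consumer, the "@context" expands those keys into full predicate URIs and the document into the same triples any other serialization would yield.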
Re: [CODE4LIB] rdf serialization
Ross Singer rossfsin...@gmail.com wrote: This is definitely where RDF outclasses almost every alternative*, because each serialization (besides RDF/XML) works extremely well for specific purposes [...]

Hmm. That depends on what you mean by alternative to RDF serialisation. I can think of a few, amongst them obviously (for me) is Topic Maps, which doesn't go down the evil triplet way with conversion back and forth to an underlying data model. Having said that, there are tuples of many kinds; it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies, for example, even though that may or may not be a better way of dealing with it. In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be.

It's hard to express what we perceive as knowledge in any notational form. The models and languages we propose are far inferior to what is needed for a world as complex as it is. But as you quoted George Box, some models are more useful than others. My personal experience is that I've got a hatred for RDF and triplets for many of the same reasons Eric touched on, and as many know, I prefer the more direct meta model of Topic Maps. However, these two different serialisation and meta model frameworks are - lo and behold! - compatible; there's canonical lossless conversion between the two. So the argument at this point comes down to personal taste for what makes more sense to you. As to more on problems of RDF, read this excellent (but slightly dated) Bray article: http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet

But wait, there's more! We haven't touched upon the next layer of the cake; OWL, which is, more or less, an ontology for dealing with all things knowledge and web. And it kinda puzzles me that it is not more often mentioned (or used) in the systems we make.
A lot of OWL was tailored towards being a better language for expressing knowledge (which in itself comes from the DAML and OIL ontologies), and then there's RDFS, and OWL in various formats, and then ...

Complexity. The problem, as far as I see it, is that there's not enough expression and rigor for the things we want to talk about in RDF, but we don't want to complicate things with OWL or RDFS either. And then there's that tedious distinction between a web resource and something that represents the thing in reality, which RDF skipped (and hacked a 303 solution onto). It's all a bit messy.

* Unless you're writing a parser, then having a kajillion serializations seriously sucks.

Some of us do. And yes, it sucks. I wonder about non-political solutions ever being possible again ... Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps http://shelter.nu/blog | google.com/+AlexanderJohannesen | http://xsiteable.org http://www.linkedin.com/in/shelterit
Re: [CODE4LIB] rdf serialization
Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's sake. From the linked, very outdated article: In fact, as far as I know I've never used an RDF application, nor do I know of any that make me want to use them. So what's wrong with this picture?

a) Nothing. You would never know if you've used a CORBA application either. Or (insert infrastructure technology here) application.
b) You've never been to the BBC website? You've never used anything that pulls in content from remote sites? Oh wait, see (a).
c) I've never used a Topic Maps application. (and see (a))

I find most existing RDF/XML entirely unreadable Patient: Doctor, Doctor, it hurts when I use RDF/XML! Doctor: Don't Do That Then. (aka #DDTT) Already covered in this thread. I'm a strong proponent of JSON-LD.

I think that when we start to bring on board metadata-rich knowledge monuments such as WorldCat ... See VIAF in this thread. See, if you must, BIBFRAME in this thread.

There /are/ challenges with RDF, not going to argue against that. And in fact I /have/ recently argued for it: http://www.cni.org/news/video-rdf-failures-linked-data-letdowns/ But for the vast majority of cases, the problems are solved (JSON-LD) or no one cares any more (httpRange14). Named Graphs (those quads used by the crazies you refer to) solve the remaining issues, but aren't standard yet. They are, however, cleverly baked into JSON-LD in the meantime.

On Tue, Nov 5, 2013 at 2:48 PM, Alexander Johannesen alexander.johanne...@gmail.com wrote: Ross Singer rossfsin...@gmail.com wrote: This is definitely where RDF outclasses almost every alternative*, Having said that, there are tuples of many kinds; it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies, for example, even though that

ad hominem? really? Your argument ceased to be valid right about here.

may or may not be a better way of dealing with it.
In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be.

Except that RDF follows the web architecture through the use of URIs for everything. That is not to be under-estimated in terms of scalability and long term usage.

But wait, there's more! We haven't touched upon the next layer of the cake; OWL, which is, more or less, an ontology for dealing with all things knowledge and web. And it kinda puzzles me that it is not more often mentioned (or used) in the systems we make. A lot of OWL was tailored towards being a better language for expressing knowledge (which in itself comes from DAML and OIL ontologies), and then there's RDFS, and OWL in various formats, and then ...

Your point? You don't like an ontology? #DDTT

Complexity. The problem, as far as I see it, is that there's not enough expression and rigor for the things we want to talk about in RDF, but we don't want to complicate things with OWL or RDFS either.

That's no more a problem of RDF than any other system.

And then there's that tedious distinction between a web resource and something that represents the thing in reality that RDF skipped (and hacked a 303 solution onto). It's all a bit messy.

That RDF skipped? No, *RDF* didn't skip it, nor did RDF propose the *303* solution. You can use URIs to identify anything. The 303/httpRange14 issue is what happens when you *dereference* a URI that identifies something that does not have a digital representation because it's a real world object. It has a direct impact on RDF, but came from the TAG, not the RDF WG. http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14 And it's not messy, it's very clean. What it is not, is pragmatic. URIs are like kittens ... practically free to get, but then you have a kitten to look after and that costs money. Thus doubling up your URIs is increasing the number of kittens you have.
[though likely not, in practice, doubling the cost] * Unless you're writing a parser, then having a kajillion serializations seriously sucks. Some of us do. And yes, it sucks. I wonder about non-political solutions ever being possible again ... This I agree with. Rob
Re: [CODE4LIB] rdf serialization
Hi, Robert Sanderson azarot...@gmail.com wrote: c) I've never used a Topic Maps application. (and see (a))

How do you know?

There /are/ challenges with RDF [...] But for the vast majority of cases, the problems are solved (JSON-LD) or no one cares any more (httpRange14).

What are you trying to say here? That httpRange14 somehow solves some issue, and we no longer need to worry about it?

Having said that, there are tuples of many kinds; it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies, for example, even though that

ad hominem? really? Your argument ceased to be valid right about here.

I think you're a touch sensitive, mate. Crazies as in, few and knowledgeable (most RDF users these days don't know what tuples are, and how they fit into the representation of data) but not mainstream. I'm one of those crazies. It was meant in jest.

may or may not be a better way of dealing with it. In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be.

Except that RDF follows the web architecture through the use of URIs for everything. That is not to be under-estimated in terms of scalability and long term usage.

So does Topic Maps. Not sure I get your point? This is just the semantics of the common denominator in tuple serialisation; there's nothing revolutionary about that, it's just an ontological commitment used by systems. URIs don't give you some magic advantage; they're still a string of characters as far as representation is concerned, and I dare say this points out the flaw in httpRange14 right there: in order to know the representation you need to resolve the identifier, i.e. there's a movable dynamic part to what in most cases needs to be static.
Not saying I have the answer, mind you, but there are some fundamental problems with knowledge representation in RDF that a lot of people don't care about, which I do feel people of a library bent should care about.

But wait, there's more! [big snip] Your point? You don't like an ontology? #DDTT

My point was the very first words in the following paragraph; Complexity. And of course I like ontologies. I've bandied them around these parts for the last 10 years or so, and I'm very happy with RDA/FRBR directions of late, taking at least RDF/Linked Data seriously. I'm thus not convinced you understood what I wrote, and if nothing else, my bad. I'll try again.

That's no more a problem of RDF than any other system.

Yes, it is. RDF is promoted as a solution to a big problem of findable and shareable meta data, however until you understand and use the full RDF cake, you're scratching the surface and doing things sloppily (and I'd argue, badly). The whole idea of strict ontologies is rigor, consistency and better means of normalising the meta data so we all can use it to represent the same things we're talking about. But the question for every piece of meta data is *authority*, which is the part of RDF that sucks. Currently it's all balanced on Wikipedia and DBpedia, which isn't a bad thing all in itself, but neither of those two are static nor authoritative in the same way, say, a global library organisation might be. With RDF, people are slowly being trained to accept all manner of crap meta data, and we as librarians should not be so eager to accept that. We can say what we like about the current library tools and models (and, of course, we do; they're not perfect), but there's a whole missing chunk of what makes RDF 'work' that is, well, sub-par for *knowledge representation*. And that's our game, no?
The shorter version: the RDF cake, with its myriad layers and standards, is too complex for most people to get right, so Linked Data comes along and tries to be simpler, making the long goal harder to achieve. I'm not, however, *against* RDF. But I am for pointing out that RDF is neither easy to work with, nor ideal for any long-term goals we might have in knowledge representation. RDF could have been made a lot better, and there are better solutions upstream, but most of this RDF talk is stuck in 1.0 territory, suffering the sins of former versions.

And then there's that tedious distinction between a web resource and something that represents the thing in reality that RDF skipped (and hacked a 303 solution onto). It's all a bit messy. That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303* solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any goal you have for RDF (and the rest of the stack), it was a mistake not to get it right *first*. OWL has better means of dealing with it, but then, complexity, yadda, yadda.

http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14 And it's not messy, it's very clean.

Subjective, of course. Have you ever played with an
Re: [CODE4LIB] rdf serialization
And yet for the last 50 years they've been creating MARC? For the last 20, they've been making EAD, TEI, etc? As with any of these, there is an expectation that end users will not be hand-rolling machine-readable serializations, but inputting into interfaces. That is not to say there aren't headaches with RDF (there is no assumption of order of triples, for example), but associating properties with the entity to which they actually belong is, I would argue, its real strength. -Ross.

On Nov 3, 2013 10:30 PM, Eric Lease Morgan emor...@nd.edu wrote: On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote: And it's not very hard given the right mindset -- it's just a fully expanded relational database, where the identifiers are URIs. Yes, it's not 1st year computer science, but it is 2nd or 3rd year rather than post graduate.

Okay, granted, but how many people do we know who can draw an entity relationship diagram? In other words, how many people can represent knowledge as a relational database? Very few people in Library Land are able to get past flat files, let alone relational databases. Yet we are hoping to build the Semantic Web where everybody can contribute. I think this is a challenge. Don’t get me wrong. I think this is a good thing to give a whirl, but I think it is hard. — ELM
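Robert's "fully expanded relational database" framing can be made concrete: a row in a hypothetical books table with title and creator columns becomes one triple per cell, with URIs standing in for the row id and the column names (an illustrative sketch using a vocabulary mentioned in this thread; the example.org URIs are invented):

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .

# row "book/1": one triple per non-null cell
<http://example.org/book/1>
    dcterms:title "Declaration of Independence (transcript)" ;
    dcterms:creator <http://id.loc.gov/authorities/names/n79089957> .
```

The subject plays the role of the primary key, the predicate the column, and the object the cell value; a foreign key becomes a URI-valued object pointing at another "row".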
Re: [CODE4LIB] rdf serialization
Eric, I can't help but think that part of your problem is that you're using RDF/XML, which definitely makes it harder to understand and visualize the data model. It might help if you switched to an RDF-native serialization, like Turtle, which makes the triples much easier to see. -Ross.

On Nov 4, 2013 6:29 AM, Ross Singer rossfsin...@gmail.com wrote: [...]
Re: [CODE4LIB] rdf serialization
I am of two minds when it comes to Linked Data and the Semantic Web. Libraries and many other professions have been encoding things for a long time, but encoding the description of a book (MARC) or marking up texts (TEI) is not the same as encoding knowledge — a goal of the Semantic Web. The former is a process of enhancing an existing object — adding metadata to it. The latter is a process of making assertions of truth. And in the case of the former, look at all the variations of describing a book, and think of all the different ways a person can mark up a text. We can’t agree.

In general, people do not think very systematically nor very logically. We are humans full of ambiguity, feelings, and perceptions. We are more animal than we are computer. We are more heart than we are mind. We are more like Leonard McCoy and less like Spock. Listen to people talk. Quite frequently we do not speak in complete sentences, and complete “sentences” are at the heart of Linked Data and the Semantic Web. Think how much we rely on body language to convey ideas. If we — as a whole — have this difficulty, then how can we expect to capture and encode data, information, and knowledge with the rigor that a computer requires, no matter how many front-ends and layers are inserted between us and the triples?

Don’t get me wrong. I am of two minds when it comes to Linked Data and the Semantic Web. On one hand I believe the technology (think triples) is a decent fit and reasonable way to represent data, information, and knowledge. Heck, I’m writing a book on the subject with examples of how to accomplish this goal. I am sincerely not threatened by this technology, nor do any of the RDF serializations get in my way. On the other hand, I just as sincerely wonder whether the majority of people can manifest the rigor required by truly stupid and unforgiving computers to articulate knowledge. — Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan, University of Notre Dame
Re: [CODE4LIB] rdf serialization
On 11/3/13 12:45 PM, Eric Lease Morgan wrote:

> Cool input. Thank you. I believe I have tweaked my assertions:
>
> 1. The Declaration of Independence was written by Thomas Jefferson
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:dc="http://purl.org/dc/elements/1.1/">
>       <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>         <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
>       </rdf:Description>
>     </rdf:RDF>

To refer to the DoI itself rather than a web page you can use either a Wikipedia or a DBpedia URI: http://en.wikipedia.org/wiki/Declaration_of_Independence

Also, as has been mentioned, it would be best to use dcterms rather than dc elements, since the assumption with dcterms is that the value is an identifier rather than a string. So you need http://purl.org/dc/terms/, which is expressed as either dct or dcterms. The dc/1.1/ vocabulary has in a sense been upgraded by dc/terms/, but I recently did a study of actual usage of Dublin Core in linked data and in fact both are heavily used, although dcterms is by far the most common, due to its compatibility with RDF.

http://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html

> 2. Thomas Jefferson is a male person
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:foaf="http://xmlns.com/foaf/0.1/">
>       <rdf:Description rdf:about="http://id.loc.gov/authorities/names/n79089957">
>         <foaf:Person foaf:gender="male" />
>       </rdf:Description>
>     </rdf:RDF>
>
> Using no additional vocabularies (ontologies), I think my hypothetical Linked Data spider / robot ought to be able to assert the following:
>
> 3. The Declaration of Independence was written by Thomas Jefferson, a male person
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:dc="http://purl.org/dc/elements/1.1/"
>              xmlns:foaf="http://xmlns.com/foaf/0.1/">
>       <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>         <dc:creator>
>           <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
>             <foaf:gender>male</foaf:gender>
>           </foaf:Person>
>         </dc:creator>
>       </rdf:Description>
>     </rdf:RDF>
>
> The W3C Validator validates Assertion #3, and returns the attached graph, which illustrates the logical combination of Assertions #1 and #2. This is hard. The Semantic Web (and RDF) attempts to codify knowledge using a strict syntax, specifically a strict syntax of triples. It is very difficult for humans to articulate knowledge, let alone codify it. How realistic is the idea of the Semantic Web? I wonder this not because I don’t think the technology can handle the problem. I say this because I think people can’t (or have great difficulty) succinctly articulating knowledge. Or maybe knowledge does not fit into triples?

I agree that it is hard, although it gets easier as you lose some of your current data processing baggage and begin to think more in terms of triples. For that, like Ross, I really advise you not to do your work in RDF/XML -- in a sense RDF/XML is a kluge to force RDF into XML, and it is much more complex than RDF in Turtle or plain triples. I also agree that not all knowledge may fit nicely into triples. RDF is great for articulations of things and relationships. Your example here is a perfect one for RDF. In fact, it is very simple conceptually and could be quite simple as triples. Conceptually you are saying:

    URI:DoI    dct:creator  URI:TJeff .
    URI:TJeff  rdf:type     foaf:Person .
    URI:TJeff  foaf:gender  "male" .    # I bet we can find a URI for male/female/?

I've experimented a bit with using iPython (with Notebook) and the Python rdflib, which can create a virtual triple store that you can query against: http://www.rdflib.net/

Again, it's all so much easier if you don't use RDF/XML.

kc

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
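Karen's plain-triples view can be sketched without any RDF tooling at all. The following is a minimal, hypothetical Python model of a triple store — triples as tuples, with a wildcard matcher standing in for a SPARQL basic graph pattern. The short names (doi, tjeff) are invented stand-ins for the full URIs:

```python
# A toy triple store: Karen's three conceptual statements as plain tuples.
# "doi" and "tjeff" abbreviate the real archives.gov and id.loc.gov URIs.
triples = [
    ("doi",   "dct:creator", "tjeff"),
    ("tjeff", "rdf:type",    "foaf:Person"),
    ("tjeff", "foaf:gender", "male"),
]

def match(pattern, store):
    """Return every triple matching a pattern; None acts as a wildcard,
    much like a variable in a SPARQL basic graph pattern."""
    return [t for t in store
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Who created the DoI?
creators = [o for (s, p, o) in match(("doi", "dct:creator", None), triples)]

# What is that creator's gender? (chain through the shared node)
genders = [o for c in creators
           for (s, p, o) in match((c, "foaf:gender", None), triples)]

print(creators)  # ['tjeff']
print(genders)   # ['male']
```

A real system would use rdflib's Graph and full URIs, but the mechanics — pattern matching over a bag of triples — are the same.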
Re: [CODE4LIB] rdf serialization
+1. kc

On 11/4/13 3:40 AM, Ross Singer wrote: Eric, I can't help but think that part of your problem is that you're using RDF/XML, which definitely makes it harder to understand and visualize the data model. It might help if you switched to an RDF-native serialization, like Turtle, which definitely helps with regard to seeing the RDF. -Ross.

On Nov 4, 2013 6:29 AM, Ross Singer rossfsin...@gmail.com wrote: And yet for the last 50 years they've been creating MARC? For the last 20, they've been making EAD, TEI, etc.? As with any of these, there is an expectation that end users will not be hand-rolling machine-readable serializations, but inputting into interfaces. That is not to say there aren't headaches with RDF (there is no assumption of order of triples, for example), but associating properties with the entity to which they actually belong is, I would argue, its real strength. -Ross.

On Nov 3, 2013 10:30 PM, Eric Lease Morgan emor...@nd.edu wrote: On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote: "And it's not very hard given the right mindset -- it's just a fully expanded relational database, where the identifiers are URIs. Yes, it's not 1st year computer science, but it is 2nd or 3rd year rather than post graduate." Okay, granted, but how many people do we know who can draw an entity-relationship diagram? In other words, how many people can represent knowledge as a relational database? Very few people in Library Land are able to get past flat files, let alone relational databases. Yet we are hoping to build the Semantic Web where everybody can contribute. I think this is a challenge. Don’t get me wrong. I think this is a good thing to give a whirl, but I think it is hard. — ELM

--
Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
+1! Well said, Karen. I would add (to further abuse your metaphor) that it’s also possible to make a delicious dish with simple ingredients. With minimal knowledge, most non-computer-science-y folks can cook up some structured data in RDF, maybe encoded in RDFa, deliver it on the same HTML pages they are already presenting to the public, and add a surprisingly large amount of value to the information they publish. I do completely agree that there’s some intellectual work necessary to do this effectively, but the same is certainly true of metadata creation. In fact, I would say that those with library backgrounds are well suited to shape and present knowledge for machine processing.

Finally, the same principles of publishing information on the human-readable Web apply to the structured data Web. Anyone can say anything about anything; it’s just up to us to figure out whether that information is meaningful or accurate. The more we build trusted sources by publishing and shaping that information with standards, best practices, and transparency, the more effective the future Web will be.

Aaron

On Nov 4, 2013, at 9:59 AM, Karen Coyle li...@kcoyle.net wrote: Eric, I really don't see how RDF or linked data is any more difficult to grasp than a database design -- and database design is a tool used by developers to create information systems for people who will never have to think about database design. Imagine the rigor that goes into the creation of the app Angry Birds, and imagine how many users are even aware of the calculation of trajectories, speed, and the inter-relations between things on the screen that will fall or explode or whatever. A master chef understands the chemistry of his famous dessert - the rest of us just eat and enjoy. kc

On 11/4/13 6:40 AM, Eric Lease Morgan wrote: I am of two minds when it comes to Linked Data and the Semantic Web.
--
Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] rdf serialization
"In general, people do not think very systematically nor very logically. We are humans full of ambiguity, feelings, and perceptions ... If we — as a whole — have this difficulty, then how can we expect to capture and encode data, information, and knowledge with the rigor that a computer requires..."

Life is analog and context-dependent, so the hopeless inconsistency normally found in metadata outside controlled environments should be expected. Given how difficult it is to get good metadata from people for things they know well and care about a great deal (how many people do you know who don't have trouble managing personal photos and important files?), I wouldn't hold my breath waiting for much useful human-generated metadata anytime soon. Despite their problems, heuristics strike me as a better way to go in the long term.

kyle
Re: [CODE4LIB] rdf serialization
Hiya,

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle li...@kcoyle.net wrote: Eric, I really don't see how RDF or linked data is any more difficult to grasp than a database design

Well, there's at least one thing that makes people tilt: the flexible structures for semantics (i.e., ontologies), where things aren't as solid as in a data model. A framework with endless options (on the surface of it) for relationships between things is daunting to people who come from a world where the options are cast in iron. There's also a shift away from things' identities being tied down in a model somewhere into a world where identities are a bit more, hmm, flexible? And less rigid? That can make some people cringe as well.

A master chef understands the chemistry of his famous dessert - the rest of us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)

Alex
Re: [CODE4LIB] rdf serialization
Hi Eric,

Complex ideas that span multiple triples are often expressed through SPARQL. In other words, you store a soup of triple statements, and the SPARQL query traverses the triples and presents the resulting information in a variety of formats, much in the same way you’d query a database using JOINs and present the resulting data on a single Web page. Using your graph, this SPARQL query should return the work and the gender of the work's creator:

    PREFIX dc: <http://purl.org/dc/terms/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?work ?gender
    WHERE {
      ?work dc:creator ?creator .
      ?creator foaf:gender ?gender .
    }

If you want to explicitly state that the Declaration of Independence was written by a male, you would need a predicate that’s set up to do that, something that takes a work as its domain and a gender as its range. It would also help to have a class for gender. That way, you could have a triple statement like this:

    <http://www.worldcat.org/identities/lccn-n79-89957>
        foaf:name "Thomas Jefferson" ;
        a :Male .

and, if:

    <http://www.archives.gov/exhibits/charters/declaration_transcript.html>
        dc:creator <http://www.worldcat.org/identities/lccn-n79-89957> .

you could infer that the creator of the Declaration is of class :Male:

    <http://www.archives.gov/exhibits/charters/declaration_transcript.html>
        :createdByGender :Male .

All the best,

Aaron Rubinstein

On Nov 3, 2013, at 12:00 AM, Eric Lease Morgan emor...@nd.edu wrote:

> How can I write an RDF serialization enabling me to express the fact that the United States Declaration Of Independence was written by Thomas Jefferson and Thomas Jefferson was a male? (And thus asserting that the Declaration of Independence was written by a male.)
>
> Suppose I have the following assertion:
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:dc="http://purl.org/dc/elements/1.1/">
>       <!-- the Declaration Of Independence was authored by Thomas Jefferson -->
>       <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>         <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
>       </rdf:Description>
>     </rdf:RDF>
>
> Suppose I have a second assertion:
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:foaf="http://xmlns.com/foaf/0.1/">
>       <!-- Thomas Jefferson was a male -->
>       <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
>         <foaf:gender>male</foaf:gender>
>       </rdf:Description>
>     </rdf:RDF>
>
> Now suppose a cool Linked Data robot came along and harvested my RDF/XML. Moreover, let's assume the robot could make the logical conclusion that the Declaration was written by a male. How might the robot express this fact in RDF/XML? The following is my first attempt at such an expression, but the resulting graph (attached) doesn't seem to visually express what I really want:
>
>     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>              xmlns:foaf="http://xmlns.com/foaf/0.1/"
>              xmlns:dc="http://purl.org/dc/elements/1.1/">
>       <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
>         <foaf:gender>male</foaf:gender>
>       </rdf:Description>
>       <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>         <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
>       </rdf:Description>
>     </rdf:RDF>
>
> Am I doing something wrong? How might you encode the following expression — The Declaration Of Independence was authored by Thomas Jefferson, and Thomas Jefferson was a male; and therefore, the Declaration Of Independence was authored by a male named Thomas Jefferson? Maybe RDF can not express this fact because it requires two predicates in a single expression, and thus the expression would not be a triple but rather a "quadruple" — subject, predicate #1, object/subject, predicate #2, and object?
>
> — Eric Morgan
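Eric's worry about needing a "quadruple" can be tested with a toy example: no four-part statement is required, because two triples that share a node can be joined — which is exactly what a SPARQL query over his graph does. A minimal sketch in plain Python (no rdflib; the abbreviated URIs are hypothetical stand-ins for the archives.gov and worldcat.org identifiers):

```python
# Eric's two harvested assertions, plus the join a SPARQL engine performs.
DOI = "archives:declaration_transcript"
TJ  = "worldcat:lccn-n79-89957"

graph = {
    (DOI, "dc:creator",  TJ),      # the DoI was authored by Jefferson
    (TJ,  "foaf:gender", "male"),  # Jefferson was a male
}

# Equivalent of: SELECT ?work ?gender
#                WHERE { ?work dc:creator ?creator .
#                        ?creator foaf:gender ?gender . }
results = [
    (work, gender)
    for (work, p1, creator) in graph if p1 == "dc:creator"
    for (s2, p2, gender) in graph if p2 == "foaf:gender" and s2 == creator
]

print(results)  # [('archives:declaration_transcript', 'male')]
```

The shared node (Jefferson's URI) is what chains the two triples together; the "quadruple" is really just two triples and a join.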
Re: [CODE4LIB] rdf serialization
Cool input. Thank you. I believe I have tweaked my assertions:

1. The Declaration of Independence was written by Thomas Jefferson

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
        <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
      </rdf:Description>
    </rdf:RDF>

2. Thomas Jefferson is a male person

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <rdf:Description rdf:about="http://id.loc.gov/authorities/names/n79089957">
        <foaf:Person foaf:gender="male" />
      </rdf:Description>
    </rdf:RDF>

Using no additional vocabularies (ontologies), I think my hypothetical Linked Data spider / robot ought to be able to assert the following:

3. The Declaration of Independence was written by Thomas Jefferson, a male person

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
        <dc:creator>
          <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
            <foaf:gender>male</foaf:gender>
          </foaf:Person>
        </dc:creator>
      </rdf:Description>
    </rdf:RDF>

The W3C Validator validates Assertion #3, and returns the attached graph, which illustrates the logical combination of Assertions #1 and #2.

This is hard. The Semantic Web (and RDF) attempts to codify knowledge using a strict syntax, specifically a strict syntax of triples. It is very difficult for humans to articulate knowledge, let alone codify it. How realistic is the idea of the Semantic Web? I wonder this not because I don’t think the technology can handle the problem. I say this because I think people can’t (or have great difficulty) succinctly articulating knowledge. Or maybe knowledge does not fit into triples?

— Eric Morgan
University of Notre Dame
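The "cool Linked Data spider / robot" behavior Eric describes — combining Assertion #1 and Assertion #2 — amounts to nothing more than a set union over triples, because both graphs name Jefferson with the same URI. A minimal sketch in plain Python, with hypothetical abbreviated URIs standing in for the full identifiers:

```python
# Two separately harvested graphs, as a robot might fetch them.
assertion_1 = {("doi", "dc:creator", "lc:n79089957")}    # who wrote it
assertion_2 = {("lc:n79089957", "foaf:gender", "male")}  # who Jefferson is

# "Merging" RDF graphs is just set union: because both graphs identify
# Jefferson with the same URI, the statements connect automatically.
merged = assertion_1 | assertion_2

def gender_of_creator(work, graph):
    # Follow dc:creator from the work, then foaf:gender from the creator.
    for s, p, o in graph:
        if s == work and p == "dc:creator":
            for s2, p2, o2 in graph:
                if s2 == o and p2 == "foaf:gender":
                    return o2
    return None

print(gender_of_creator("doi", merged))       # male
print(gender_of_creator("doi", assertion_1))  # None: one graph alone is not enough
```

The second call shows why the merge matters: neither assertion by itself supports the conclusion that the Declaration was written by a male.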
Re: [CODE4LIB] rdf serialization
You're still missing a vital step. Currently your assertion is that the creator /of a web page/ is Jefferson, which is clearly false. The page (...) is a transcription of the Declaration of Independence. The Declaration of Independence is written by Jefferson. Jefferson is Male.

And it's not very hard given the right mindset -- it's just a fully expanded relational database, where the identifiers are URIs. Yes, it's not 1st year computer science, but it is 2nd or 3rd year rather than post graduate. Which is not to say that people do not have great trouble succinctly articulating knowledge, but, like any skill, it can be learned. Just look at the variation in the ways of writing papers ... some people can do it very clearly, some have much more difficulty.

And with JSON-LD, you don't have to understand the RDF, just a clean representation of it.

Rob

On Sun, Nov 3, 2013 at 1:45 PM, Eric Lease Morgan emor...@nd.edu wrote:
Re: [CODE4LIB] rdf serialization
On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote: Currently your assertion is that the creator /of a web page/ is Jefferson, which is clearly false. The page (...) is a transcription of the Declaration of Independence. The Declaration of Independence is written by Jefferson. Jefferson is Male. Okay. ‘Makes sense, but let’s find a URI for THE Declaration Of Independence — that thing under glass in the National Archives. —ELM
Re: [CODE4LIB] rdf serialization
On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote: And it's not very hard given the right mindset -- it's just a fully expanded relational database, where the identifiers are URIs. Yes, it's not 1st year computer science, but it is 2nd or 3rd year rather than post graduate.

Okay, granted, but how many people do we know who can draw an entity-relationship diagram? In other words, how many people can represent knowledge as a relational database? Very few people in Library Land are able to get past flat files, let alone relational databases. Yet we are hoping to build the Semantic Web where everybody can contribute. I think this is a challenge. Don’t get me wrong. I think this is a good thing to give a whirl, but I think it is hard.

— ELM