Re: [CODE4LIB] rdf serialization

2013-11-07 Thread Karen Coyle

Ethan, thanks, it's good to have examples.

I'd say that for simple linking SPARQL may not be necessary, perhaps 
should be avoided, but IF you need something ELSE, say a query WHERE you 
have conditions, THEN you may find that a query language is needed.


kc

On 11/6/13 9:14 AM, Ethan Gruber wrote:

I think that the answer to #1 is that if you want or expect people to use
your endpoint that you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.

Ethan


Re: [CODE4LIB] rdf serialization

2013-11-07 Thread Karen Coyle

Ross, I think you are not alone, as per this:

http://howfuckedismydatabase.com/nosql/

kc

On 11/6/13 8:54 AM, Ross Singer wrote:

Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed API interface from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is no, but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet





Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by "better" I guess you meant
probably more widely used and probably longer lasting? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC - me
http://companjen.name/id/BC.html - about me


On 05-11-13 19:03, Karen Coyle li...@kcoyle.net wrote:

Eric, I found an even better URI for you for the Declaration of
Independence:

http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the
LC Name Authority, but the related VIAF record, as per the VIAF
definition of itself, represents the real world thing itself. That URI is:

http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
will add that.

kc


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ed Summers
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen
ben.compan...@dans.knaw.nl wrote:
 The URIs you gave get me to webpages *about* the Declaration of
 Independence. I'm sure it's just a copy/paste mistake, but in this context
 you want the exact right URIs of course. And by better I guess you meant
 probably more widely used and probably longer lasting? :)

 LOC URI for the DoI (the work) is without .html:
 http://id.loc.gov/authorities/names/n79029194

 VIAF URI for the DoI is without trailing /:
 http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the
W3C's approach to educating the world about URIs for real-world things
hasn't quite caught on, while RESTful ones (promoted by the IETF)
have. If someone as knowledgeable as Karen can make that mistake, what
does it say about our ability as practitioners to use URIs this way, and
about our ability to write software to do it as well?

In a REST world, when you get a 200 OK it doesn't mean the resource is
a Web Document. The resource can be anything, you just happened to
successfully get a representation of it. If you like you can provide
hints about the nature of the resource in the representation, but the
resource itself never goes over the wire, the representation does.
It's a subtle but important difference in two ways of looking at Web
architecture.
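
A sketch of what Ed describes, using the VIAF URI from earlier in the thread (the exchange is illustrative, not a capture of viaf.org's actual behavior):

```http
GET /viaf/179420344 HTTP/1.1
Host: viaf.org
Accept: application/rdf+xml

HTTP/1.1 200 OK
Content-Type: application/rdf+xml

...an RDF/XML representation follows; the resource itself (the
Declaration of Independence) never goes over the wire...
```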

If you find yourself interested in making up your own mind about this
you can find the RESTful definitions of resource and representation in
the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1].
You can find language about Web Documents (or at least its more recent
variant, Information Resource) in the W3C's Architecture of the World
Wide Web [2].

Obviously I'm biased towards the IETF's position on this. This is just
my personal opinion from my experience as a Web developer trying to
explain Linked Data to practitioners, looking at the Web we have, and
chatting with good friends who weren't afraid to tell me what they
thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Eric Lease Morgan
 Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
 sake...

ROTFL, because that is exactly what I was thinking. “Vi is better. No, emacs. 
You are both wrong; it is all about BBEdit!” Each tool, whether an editor, an 
email client, or an RDF serialization, has its own strengths and 
weaknesses. Like religions, none of them is perfect, but they all have some 
value. —ELM


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ben, yes, I copied the URIs from the browser address bar, and that was 
sloppy. However, it was the quickest thing to do, plus it was addressed to a 
human, not a machine. The URI for the LC entry is there on the page. 
Unfortunately, the VIAF URI is labeled "Permalink" -- which isn't obvious.


I guess if I want anyone to answer my emails, I need to post mistakes. 
When I post correct information, my mail goes unanswered (not even a 
thanks). So, thanks, guys.


kc

On 11/6/13 12:47 AM, Ben Companjen wrote:

Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by better I guess you meant
probably more widely used and probably longer lasting? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC - me
http://companjen.name/id/BC.html - about me




--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
I could have known it was a test! ;)

Thanks Karen :)

On 06-11-13 15:20, Karen Coyle li...@kcoyle.net wrote:

I guess if I want anyone to answer my emails, I need to post mistakes.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
I wrote about this a few months back at 
http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen 
alexander.johanne...@gmail.com wrote:

 But the
 question to every piece of meta data is *authority*, which is the part
 of RDF that sucks.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
In the kinds of data I have to deal with, who made an assertion, or what 
sources provide evidence for a statement, are vitally important bits of 
information, so it's not just a data-source integration problem, where you're 
taking batches of triples from different sources and putting them together. 
It's a question of how to encode scholarly, messy, humanities data.

The answer, of course, might be "don't use RDF for that" :-). I'd rather not 
invent something if I don't have to, though.

Hugh

On Nov 6, 2013, at 10:56 , Robert Sanderson azarot...@gmail.com wrote:

 A large number of triples that all have different provenance? I'm curious
 as to how you get them :)
 
 Rob
 
 


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or insert your transmission format of choice-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
 RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote:

 Does that work right down to the level of the individual triple though? If
 a large percentage of my triples are each in their own individual graphs,
 won't that be chaos? I really don't know the answer, it's not a rhetorical
 question!

 Hugh

 On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com wrote:

  Named Graphs are the way to solve the issue you bring up in that post, in
  my opinion.  You mint an identifier for the graph, and associate the
  provenance and other information with that.  This then gets ingested as the
  4th URI into a quad store, so you don't lose the provenance information.
 
  In JSON-LD:
  {
    "@id": "uri-for-graph",
    "dcterms:creator": "uri-for-hugh",
    "@graph": [
      // ... triples go here ...
    ]
  }
 
  Rob
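
The same pattern as Rob's JSON-LD example, sketched in TriG (the URIs are placeholders, as in the original):

```trig
@prefix dcterms: <http://purl.org/dc/terms/> .

# Provenance about the graph sits in the default graph...
<uri-for-graph> dcterms:creator <uri-for-hugh> .

# ...while the asserted triples sit inside the named graph itself.
<uri-for-graph> {
    # ... triples go here ...
}
```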
 
 
 
 



Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ross, I agree with your statement that data doesn't have to be RDF all 
the way down, etc. But I'd like to hear more about why you think SPARQL 
availability has less value, and if you see an alternative to SPARQL for 
querying.


kc


On 11/6/13 8:11 AM, Ross Singer wrote:

Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
  That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or insert your transmission format of choice-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
  RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.




--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
The answer is purely because the RDF data model and the technology around it 
looks like it would almost do what we need it to.

I do not, and cannot, assume a closed world. The open world assumption is one 
of the attractive things about RDF, in fact :-)

Hugh

On Nov 6, 2013, at 11:11 , Ross Singer rossfsin...@gmail.com wrote:

 My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
 closed world assumption in your application?


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed API interface from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is no, but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification
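
The contrast Ross draws might look like this in practice; both requests ask for the same known item, and both the hosts and paths are hypothetical:

```http
# Domain-specific API: fetch a known item directly
GET /items/12345.json HTTP/1.1
Host: api.example.org

# Generic SPARQL endpoint: the caller must construct (and URL-encode) a query
GET /sparql?query=DESCRIBE%20%3Chttp%3A%2F%2Fapi.example.org%2Fitems%2F12345%3E HTTP/1.1
Host: api.example.org
```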


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote:

 Ross, I agree with your statement that data doesn't have to be RDF all
 the way down, etc. But I'd like to hear more about why you think SPARQL
 availability has less value, and if you see an alternative to SPARQL for
 querying.

 kc






Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I'm skeptical of this in a usable application or interface.

Applications have constraints.  There are predicates you care about, there
are values you display in specific ways.  There are expectations, based on
the domain, in the data that are either driven by the interface or the
needs of the consumers.

I have yet to see an example of arbitrary and unexpected data exposed in
an application that people actually use.

-Ross.


On Wed, Nov 6, 2013 at 11:39 AM, Hugh Cayless philomou...@gmail.com wrote:

 The answer is purely because the RDF data model and the technology around
 it looks like it would almost do what we need it to.

 I do not, and cannot, assume a closed world. The open world assumption is
 one of the attractive things about RDF, in fact :-)

 Hugh

 On Nov 6, 2013, at 11:11 , Ross Singer rossfsin...@gmail.com wrote:

  My question for you, however, is why are you using a triple store for
 this?
  That is, why bother with the broad and general model in what I assume is
 a
  closed world assumption in your application?



Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ethan Gruber
I think that the answer to #1 is that if you want or expect people to use
your endpoint that you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.
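
The sort of documented example query Ethan describes might look like this; a minimal sketch against a hypothetical endpoint whose model uses SKOS labels (the class name is illustrative, not taken from the British Museum's or Nomisma's actual ontologies):

```sparql
# A "simple" starter query of the kind endpoint documentation should
# include: list ten items of a known class with their labels.
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?item ?label
WHERE {
  ?item a dcterms:PhysicalResource ;   # illustrative class
        skos:prefLabel ?label .
}
LIMIT 10
```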

Ethan


On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer rossfsin...@gmail.com wrote:

 Hey Karen,

 It's purely anecdotal (albeit anecdotes borne from working at a company
 that offered, and has since abandoned, a sparql-based triple store
 service), but I just don't see the interest in arbitrary SPARQL queries
 against remote datasets that I do against linking to (and grabbing) known
 items.  I think there are multiple reasons for this:

 1) Unless you're already familiar with the dataset behind the SPARQL
 endpoint, where do you even start with constructing useful queries?
 2) SPARQL as a query language is a combination of being too powerful and
 completely useless in practice: query timeouts are commonplace, endpoints
 don't support all of 1.1, etc.  And, going back to point #1, it's hard to
 know how to optimize your queries unless you are already pretty familiar
 with the data
 3) SPARQL is a flawed API interface from the get-go (IMHO) for the same
 reason we don't offer a public SQL interface to our RDBMSes

 Which isn't to say it doesn't have its uses or applications.

 I just think that in most cases domain/service-specific APIs (be they
 RESTful, based on the Linked Data API [0], whatever) will likely be favored
 over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
 sure the answer is no, but that's the future I foresee, personally.

 -Ross.
 0. https://code.google.com/p/linked-data-api/wiki/Specification


 On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote:

  Ross, I agree with your statement that data doesn't have to be RDF all
  the way down, etc. But I'd like to hear more about why you think SPARQL
  availability has less value, and if you see an alternative to SPARQL for
  querying.
 
  kc
 
 
 
  On 11/6/13 8:11 AM, Ross Singer wrote:
 
  Hugh, I don't think you're in the weeds with your question (and, while I
  think that named graphs can provide a solution to your particular
 problem,
  that doesn't necessarily mean that it doesn't raise more questions or
  potentially more frustrations down the line - like any new power, it can
  be
  used for good or evil and the difference might not be obvious at first).
 
  My question for you, however, is why are you using a triple store for
  this?
That is, why bother with the broad and general model in what I assume
  is a
  closed world assumption in your application?
 
  We don't generally use XML databases (Marklogic being a notable
  exception),
  or MARC databases, or <insert your transmission format of
  choice>-specific
  databases because usually transmission formats are designed to account
 for
  lots and lots of variations and maximum flexibility, which generally is
  the
  opposite of the modeling that goes into a specific app.
 
  I think there's a world of difference between modeling your data so it
 can
  be represented in RDF (and, possibly, available via SPARQL, but I think
  there is *far* less value there) and committing to RDF all the way down.
RDF is a generalization so multiple parties can agree on what data
  means,
  but I would have a hard time swallowing the argument that
 domain-specific
  data must be RDF-native.
 
  -Ross.
 
 
  On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com
  wrote:
 
   Does that work right down to the level of the individual triple though?
  If
  a large percentage of my triples are each in their own individual
 graphs,
  won't that be chaos? I really don't know the answer, it's not a
  rhetorical
  question!
 
  Hugh
 
  On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com
 wrote:
 
   Named Graphs are the way to solve the issue you bring up in that post,
  in
  my opinion.  You mint an identifier for the graph, and associate the
  provenance and other information with that.  This then gets ingested
 as
 
  the
 
  4th URI into a quad store, so you don't lose the provenance
 information.
 
  In JSON-LD:
  {
    "@id": "uri-for-graph",
    "dcterms:creator": "uri-for-hugh",
    "@graph": [
      // ...

Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote:
 This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using 
 a strict syntax, specifically a strict syntax of triples. It is very 
 difficult for humans to articulate knowledge, let alone codifying it. How 
 realistic is the idea of the Semantic Web? I wonder this not because I don’t 
 think the technology can handle the problem. I say this because I think 
 people can’t (or have great difficulty) succinctly articulating knowledge. Or 
 maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded
completely in triples, any more than it can be encoded completely in
finding aids or books.

One thing that I (naively) wasn't fully aware of when I started
dabbling the Semantic Web and Linked Data is how much the technology
is entangled with debates about the philosophy of language. These
debates play out in a variety of ways, but most notably in
disagreements about the nature of a resource (httpRange-14) in Web
Architecture. Shameless plug: Dorothea Salo and I tried to write about
how some of this impacts the domain of the library/archive [1].

One of the strengths of RDF is its notion of a data model that is
behind the various serializations (xml, ntriples, json, n3, turtle,
etc). I'm with Ross though: I find it much easier to read rdf as turtle
or json-ld than rdf/xml.

//Ed

[1] http://arxiv.org/abs/1302.4591


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Karen Coyle

On 11/5/13 6:45 AM, Ed Summers wrote:

I'm with Ross though:

... and Karen!


I find it much easier to read rdf as turtle or json-ld than rdf/xml.


It's easier to read, but it's also easier to create *correctly*, and 
that, to me, is the key point. Folks who are used to XML have a certain 
notion of data organization in mind. Working with RDF in XML one tends 
to fall into the XML data think rather than the RDF concepts.


I have suggested (repeatedly) to LC on the BIBFRAME list that they 
should use turtle rather than RDF/XML in their examples -- because I 
suspect that they may be doing some XML think in the background. This 
seems to be the case because in some of the BIBFRAME documents the 
examples are in XML but not RDF/XML. I find this rather ... disappointing.


I also find it useful to create pseudo-code triples using whatever 
notation I find handy, as in the example I provided earlier for Eric. 
Writing out actual valid triples is a pain, but seeing your data as 
triples is very useful.
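
For instance (a purely illustrative sketch; the URIs are placeholders,
not real identifiers), pseudo-code triples jotted in rough Turtle might
look like:

```turtle
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# rough, illustrative triples -- placeholder URIs throughout
<http://example.org/declaration> dc:creator  <http://example.org/jefferson> .
<http://example.org/jefferson>   foaf:gender "male" .
```

Whether or not the prefixes and URIs are exactly right matters less than
seeing the subject-predicate-object shape of the data.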


kc

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Aaron Rubinstein
FWIW, 

Here’s the W3C’s RDF Primer with examples in turtle instead of RDF/XML:
http://www.w3.org/2007/02/turtle/primer/

And the turtle spec:
http://www.w3.org/TR/turtle/

Aaron


On Nov 5, 2013, at 10:07 AM, Karen Coyle li...@kcoyle.net wrote:

 On 11/5/13 6:45 AM, Ed Summers wrote:
 I'm with Ross though:
 ... and Karen!
 
 I find it much easier to read rdf as turtle or json-ld than rdf/xml.
 
 It's easier to read, but it's also easier to create *correctly*, and that, to 
 me, is the key point. Folks who are used to XML have a certain notion of data 
 organization in mind. Working with RDF in XML one tends to fall into the XML 
 data think rather than the RDF concepts.
 
 I have suggested (repeatedly) to LC on the BIBFRAME list that they should use 
 turtle rather than RDF/XML in their examples -- because I suspect that they 
 may be doing some XML think in the background. This seems to be the case 
 because in some of the BIBFRAME documents the examples are in XML but not 
 RDF/XML. I find this rather ... disappointing.
 
 I also find it useful to create pseudo-code triples using whatever notation 
 I find handy, as in the example I provided earlier for Eric. Writing out 
 actual valid triples is a pain, but seeing your data as triples is very 
 useful.
 
 kc
 
 -- 
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Tue, Nov 5, 2013 at 10:07 AM, Karen Coyle li...@kcoyle.net wrote:
 I have suggested (repeatedly) to LC on the BIBFRAME list that they should
 use turtle rather than RDF/XML in their examples -- because I suspect that
 they may be doing some XML think in the background. This seems to be the
 case because in some of the BIBFRAME documents the examples are in XML but
 not RDF/XML. I find this rather ... disappointing.

I think you'll find that many people and organizations are much more
familiar with xml and its data model than they are with rdf. Sometimes
when people with a strong background in xml come to rdf they naturally
want to keep thinking in terms of xml. This is possible up to a point,
but it eventually hampers understanding.

//Ed


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Sheila M. Morrissey
Ed -- thanks for the link -- you and Dorothy have written a tremendously clear 
and useful piece
Sheila

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ed 
Summers
Sent: Tuesday, November 05, 2013 9:45 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] rdf serialization

On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote:
 This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using 
 a strict syntax, specifically a strict syntax of triples. It is very 
 difficult for humans to articulate knowledge, let alone codifying it. How 
 realistic is the idea of the Semantic Web? I wonder this not because I don't 
 think the technology can handle the problem. I say this because I think 
 people can't (or have great difficulty) succinctly articulating knowledge. Or 
 maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded completely in 
triples, any more than it can be encoded completely in finding aids or books.

One thing that I (naively) wasn't fully aware of when I started dabbling the 
Semantic Web and Linked Data is how much the technology is entangled with 
debates about the philosophy of language. These debates play out in a variety 
of ways, but most notably in disagreements about the nature of a resource 
(httpRange-14) in Web Architecture. Shameless plug: Dorothea Salo and I tried 
to write about how some of this impacts the domain of the library/archive [1].

One of the strengths of RDF is its notion of a data model that is behind the 
various serializations (xml, ntriples, json, n3, turtle, etc). I'm with Ross 
though: I find it much easier to read rdf as turtle or json-ld than rdf/xml.

//Ed

[1] http://arxiv.org/abs/1302.4591


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Karen Coyle
Eric, I found an even better URI for you for the Declaration of 
Independence:


http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the 
LC Name Authority, but the related VIAF record, as per the VIAF 
definition of itself, represents the real world thing itself. That URI is:


http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I 
will add that.


kc


On 11/2/13 9:00 PM, Eric Lease Morgan wrote:

How can I write an RDF serialization enabling me to express the fact that the 
United States Declaration Of Independence was written by Thomas Jefferson and 
Thomas Jefferson was a male? (And thus asserting that the Declaration of 
Independence was written by a male.)

Suppose I have the following assertion:

   <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/">

     <!-- the Declaration Of Independence was authored by Thomas Jefferson -->
     <rdf:Description
       rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
       <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
     </rdf:Description>

   </rdf:RDF>

Suppose I have a second assertion:

   <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:foaf="http://xmlns.com/foaf/0.1/">

     <!-- Thomas Jefferson was a male -->
     <rdf:Description
       rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
       <foaf:gender>male</foaf:gender>
     </rdf:Description>

   </rdf:RDF>

Now suppose a cool Linked Data robot came along and harvested my RDF/XML. 
Moreover let's assume the robot could make the logical conclusion that the 
Declaration was written by a male. How might the robot express this fact in 
RDF/XML? The following is my first attempt at such an expression, but the 
resulting graph (attached) doesn't seem to visually express what I really want:

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:foaf="http://xmlns.com/foaf/0.1/"
   xmlns:dc="http://purl.org/dc/elements/1.1/">

   <rdf:Description
     rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
     <foaf:gender>male</foaf:gender>
   </rdf:Description>

   <rdf:Description
     rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
     <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
   </rdf:Description>
</rdf:RDF>

Am I doing something wrong? How might you encode the following expression: “The 
Declaration Of Independence was authored by Thomas Jefferson, and Thomas Jefferson 
was a male. And therefore, the Declaration Of Independence was authored by a male 
named Thomas Jefferson”? Maybe RDF cannot express this fact because it requires two 
predicates in a single expression, and thus the expression would not be a triple but 
rather a “quadrile”: object, predicate #1, subject/object, predicate #2, and 
subject?


—
Eric Morgan
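
[A hedged editorial aside, not part of the original message: one likely
culprit in the RDF/XML above is that the WorldCat URI sits inside
dc:creator as a text node, i.e. a literal string, so the two descriptions
never share a node and the graph cannot chain them. If the creator is
given as a resource instead, the whole thing collapses into two plain
triples, sketched here in Turtle:]

```turtle
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://www.archives.gov/exhibits/charters/declaration_transcript.html>
    dc:creator <http://www.worldcat.org/identities/lccn-n79-89957> .

<http://www.worldcat.org/identities/lccn-n79-89957>
    foaf:gender "male" .

# No third statement or "quadrile" is needed: "written by a male" falls
# out of following dc:creator and then foaf:gender through the shared URI.
```
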

[attachment: graph visualization of the triples]





--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ross Singer
On Tue, Nov 5, 2013 at 9:45 AM, Ed Summers e...@pobox.com wrote:

 On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan emor...@nd.edu wrote:
  This is hard. The Semantic Web (and RDF) attempt at codifying knowledge
 using a strict syntax, specifically a strict syntax of triples. It is very
 difficult for humans to articulate knowledge, let alone codifying it. How
 realistic is the idea of the Semantic Web? I wonder this not because I
 don’t think the technology can handle the problem. I say this because I
 think people can’t (or have great difficulty) succinctly articulating
 knowledge. Or maybe knowledge does not fit into triples?

 I think you're right Eric. I don't think knowledge can be encoded
 completely in triples, any more than it can be encoded completely in
 finding aids or books.


Or... anything, honestly.  We're humans. Our understanding and perception
of the universe changes daily.  I don't think it's unreasonable to accept
that any description of the universe, input by a human, will reflect the
fundamental reality that what was encoded might be wrong.  I don't really
buy the argument that RDF is somehow less capable of succinctly
articulating knowledge compared to anything else.  All models are wrong.
 Some are useful.


 One thing that I (naively) wasn't fully aware of when I started
 dabbling the Semantic Web and Linked Data is how much the technology
 is entangled with debates about the philosophy of language. These
 debates play out in a variety of ways, but most notably in
 disagreements about the nature of a resource (httpRange-14) in Web
 Architecture. Shameless plug: Dorothea Salo and I tried to write about
 how some of this impacts the domain of the library/archive [1].

 OTOH, schema.org doesn't concern itself at all with this dichotomy
(information vs. non-information resource) and I think that most (sane,
pragmatic) practitioners would consider that linked data, as well.  Given
the fact that schema.org is so easily mapped to RDF, I think this argument
is going to be so polluted (if it isn't already) that it will eventually
have to evolve to a far less academic position.

One of the strengths of RDF is its notion of a data model that is
 behind the various serializations (xml, ntriples, json, n3, turtle,
  etc). I'm with Ross though: I find it much easier to read rdf as turtle
 or json-ld than rdf/xml.

 This is definitely where RDF outclasses almost every alternative*, because
each serialization (besides RDF/XML) works extremely well for specific
purposes:

Turtle is great for writing RDF (either to humans or computers) and being
able to understand what is being modeled.

n-triples/quads is great for sharing data in bulk.

json-ld is ideal for API responses, since the consumer doesn't have to know
anything about RDF to have a useful data object, but if they do, all the
better.
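
As a sketch of that last point (a made-up response; the vocabulary,
URIs, and values are illustrative, not any particular API's), a JSON-LD
payload reads as plain JSON to a consumer who ignores the @-keys:

```json
{
  "@context": {
    "dc": "http://purl.org/dc/elements/1.1/",
    "title": "dc:title",
    "creator": { "@id": "dc:creator", "@type": "@id" }
  },
  "@id": "http://example.org/doc/1",
  "title": "Declaration of Independence (transcript)",
  "creator": "http://www.worldcat.org/identities/lccn-n79-89957"
}
```

A non-RDF consumer just reads `title` and `creator`; an RDF-aware one
expands the same document into triples via the context.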

-Ross.
* Unless you're writing a parser, then having a kajillion serializations
seriously sucks.


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Ross Singer rossfsin...@gmail.com wrote:
 This is definitely where RDF outclasses almost every alternative*, because
 each serialization (besides RDF/XML) works extremely well for specific
 purposes [...]

Hmm. That depends on what you mean by alternative to RDF
serialisation. I can think of a few, amongst them obviously (for me)
is Topic Maps which don't go down the evil triplet way with conversion
back and forth to an underlying data model.

Having said that, there are tuples of many kinds; it's only that the
triplet is the most used under the W3C banner. Many are using a
more expressive quad (a few crazies, for example), even though that
may or may not be a better way of dealing with it. In the end, it all
comes down to some variation over frames theory (or bundles); a
serialisation of key/value pairs with some ontological denotation for
what the semantics of that might be.

It's hard to express what we perceive as knowledge in any notational
form. The models and languages we propose are far inferior to what is
needed for a world as complex as it is. But as you quoted George Box,
some models are more useful than others.

My personal experience is that I've got a hatred for RDF and triplets
for many of the same reasons Eric touch on, and as many know, I prefer
the more direct meta model of Topic Maps. However, these two different
serialisation and meta model frameworks are - lo and behold! -
compatible; there's canonical lossless conversion between the two. So
the argument at this point comes down to personal taste for what makes
more sense to you.

As to more on the problems of RDF, read this excellent (but slightly dated)
Bray article;
   http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet

But wait, there's more! We haven't touched upon the next layer of the
cake; OWL, which is, more or less, an ontology for dealing with all
things knowledge and web. And it kinda puzzles me that it is not more
often mentioned (or used) in the systems we make. A lot of OWL was
tailored towards being a better language for expressing knowledge
(which in itself comes from DAML and OIL ontologies), and then there's
RDFs, and OWL in various formats, and then ...

Complexity. The problem, as far as I see it, is that there's not
enough expression and rigor for the things we want to talk about in
RDF, but we don't want to complicate things with OWL or RDFs either.
And then there's that tedious distinction between a web resource and
something that represents the thing in reality that RDF skipped (and
hacked a 304 solution to). It's all a bit messy.

 * Unless you're writing a parser, then having a kajillion serializations
 seriously sucks.

Some of us do. And yes, it sucks. I wonder about non-political
solutions ever being possible again ...


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
 http://shelter.nu/blog  |  google.com/+AlexanderJohannesen  |
http://xsiteable.org
 http://www.linkedin.com/in/shelterit


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Robert Sanderson
Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
sake.


From the linked, very outdated article:

 In fact, as far as I know I've never used an RDF application, nor do I
know of any that make me want to use them.  So what's wrong with this
picture?

a) Nothing.  You would never know if you've used a CORBA application
either. Or (insert infrastructure technology here) application.
b) You've never been to the BBC website? You've never used anything that
pulls in content from remote sites? Oh wait, see (a).
c) I've never used a Topic Maps application. (and see (a))

 I find most existing RDF/XML entirely unreadable
Patient: Doctor, Doctor it hurts when I use RDF/XML!
Doctor: Don't Do That Then.   (aka #DDTT)

Already covered in this thread. I'm a strong proponent of JSON-LD.

 I think that when we start to bring on board metadata-rich knowledge
monuments such as WorldCat ...

See VIAF in this thread. See, if you must, BIBFRAME in this thread.

There /are/ challenges with RDF, not going to argue against that. And in
fact I /have/ recently argued for it:
http://www.cni.org/news/video-rdf-failures-linked-data-letdowns/

But for the vast majority of cases, the problems are solved (JSON-LD) or no
one cares any more (httpRange14).  Named Graphs (those quads used by
crazies you refer to) solve the remaining issues, but aren't standard yet.
 They are, however, cleverly baked into JSON-LD for the time that they are.
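
Picking up the truncated named-graph example from earlier in the thread,
a graph with its own provenance might look like this in JSON-LD (the
URIs are placeholders; the shape, not the identifiers, is the point):

```json
{
  "@context": { "dcterms": "http://purl.org/dc/terms/" },
  "@id": "http://example.org/graph/1",
  "dcterms:creator": { "@id": "http://example.org/people/hugh" },
  "@graph": [
    {
      "@id": "http://example.org/resource/1",
      "dcterms:title": "Some described resource"
    }
  ]
}
```

The statements inside `@graph` belong to the graph identified by the
outer `@id`, so provenance attaches to the graph URI rather than to each
individual triple.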


On Tue, Nov 5, 2013 at 2:48 PM, Alexander Johannesen 
alexander.johanne...@gmail.com wrote:

 Ross Singer rossfsin...@gmail.com wrote:
  This is definitely where RDF outclasses almost every alternative*,

 Having said that, there's tuples of many kinds, it's only that the
 triplet is the most used under the W3C banner. Many are using a
 more expressive quad (a few crazies, for example), even though that

ad hominem? really? Your argument ceased to be valid right about here.

 may or may not be a better way of dealing with it. In the end, it all
 comes down to some variation over frames theory (or bundles); a
 serialisation of key/value pairs with some ontological denotation for
 what the semantics of that might be.

Except that RDF follows the web architecture through the use of URIs for
everything. That is not to be under-estimated in terms of scalability and
long term usage.


 But wait, there's more! We haven't touched upon the next layer of the
 cake; OWL, which is, more or less, an ontology for dealing with all
 things knowledge and web. And it kinda puzzles me that it is not more
 often mentioned (or used) in the systems we make. A lot of OWL was
 tailored towards being a better language for expressing knowledge
 (which in itself comes from DAML and OIL ontologies), and then there's
 RDFs, and OWL in various formats, and then ...

Your point? You don't like an ontology? #DDTT


 Complexity. The problem, as far as I see it, is that there's not
 enough expression and rigor for the things we want to talk about in
 RDF, but we don't want to complicate things with OWL or RDFs either.

That's no more a problem of RDF than any other system.

 And then there's that tedious distinction between a web resource and
 something that represents the thing in reality that RDF skipped (and
 hacked a 304 solution to). It's all a bit messy.

That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
solution.
You can use URIs to identify anything.

The 303/httprange14 issue is what happens when you *dereference* a URI that
identifies something that does not have a digital representation because
it's a real world object.  It has a direct impact on RDF, but came from the
TAG not the RDF WG.

http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14

And it's not messy, it's very clean. What it is not, is pragmatic. URIs are
like kittens ... practically free to get, but then you have a kitten to
look after and that costs money.  Thus doubling up your URIs is increasing
the number of kittens you have. [though likely not, in practice, doubling
the cost]

  * Unless you're writing a parser, then having a kajillion serializations
  seriously sucks.
 Some of us do. And yes, it sucks. I wonder about non-political
 solutions ever being possible again ...

This I agree with.

Rob


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Hi,

Robert Sanderson azarot...@gmail.com wrote:
 c) I've never used a Topic Maps application. (and see (a))

How do you know?

 There /are/ challenges with RDF [...]
 But for the vast majority of cases, the problems are solved (JSON-LD) or no
 one cares any more (httpRange14).

What are you trying to say here? That httpRange14 somehow solves some
issue, and we no longer need to worry about it?

 Having said that, there's tuples of many kinds, it's only that the
 triplet is the most used under the W3C banner. Many are using a
 more expressive quad (a few crazies, for example), even though that

 ad hominem? really? Your argument ceased to be valid right about here.

I think you're a touch sensitive, mate. Crazies as in, few and
knowledgeable (most RDF users these days don't know what tuples are,
and how they fit into the representation of data) but not mainstream.
I'm one of those crazies. It was meant in jest.

 may or may not be a better way of dealing with it. In the end, it all
 comes down to some variation over frames theory (or bundles); a
 serialisation of key/value pairs with some ontological denotation for
 what the semantics of that might be.

 Except that RDF follows the web architecture through the use of URIs for
 everything. That is not to be under-estimated in terms of scalability and
 long term usage.

So does Topic Maps. Not sure I get your point? This is just semantics
of the key denominator in tuple serialisation; there's nothing
revolutionary about that, it's just an ontological commitment used by
systems. URIs don't give you some magic advantage; they're still a
string of characters as far as representation is concerned, and I dare
say, this points out the flaw in httpRange14 right there; in order to
know representation you need to resolve the identifier, ie. there's a
movable dynamic part to what in most cases needs to be static. Not
saying I have the answer, mind you, but there are some fundamental
problems with knowledge representation in RDF that a lot of people
don't care about, but which I do feel people of a library bent should 
care about.

 But wait, there's more! [big snip]

 Your point? You don't like an ontology? #DDTT

My point was the very first words in the following paragraph;

 Complexity.

And of course I like ontologies. I've bandied them around these parts
for the last 10 years or so, and I'm very happy with RDA/FRBR
directions of late, taking at least RDF/Linked Data seriously. I'm
thus not convinced you understood what I wrote, and if nothing else,
my bad. I'll try again.

 That's no more a problem of RDF than any other system.

Yes, it is. RDF is promoted as a solution to a big problem of findable
and shareable meta data, however until you understand and use the full
RDF cake, you're scratching the surface and doing things sloppy (and
I'd argue, badly). The whole idea of strict ontologies is rigor,
consistency and better means of normalising the meta data so we all
can use it to represent the same things we're talking about. But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks. Currently it's all balanced on WikiPedia and
dbPedia, which isn't a bad thing all in itself, but neither of those
two are static nor authoritative in the same way, say, a global
library organisation might be. With RDF, people are slowly being
trained to accept all manners of crap meta data, and we as librarians
should not be so eager to accept that. We can say what we like about
the current library tools and models (and, of course, we do; they're
not perfect), but there's a whole missing chunk of what makes RDF
'work' that is, well, sub-par for *knowledge representation*. And
that's our game, no?

The shorter version; the RDF cake with it myriad of layers and
standards are too complex for most people to get right, so Linked Data
comes along and tries to be simpler by making the long goal harder to
achieve.

I'm not, however, *against* RDF. But I am for pointing out that RDF is
neither easy to work with, nor ideal for any long-term goals we might
have in knowledge representation. RDF could have been made a lot
better which has better solutions upstream, but most of this RDF talk
is stuck in 1.0 territory, suffering the sins of former versions.

 And then there's that tedious distinction between a web resource and
 something that represents the thing in reality that RDF skipped (and
 hacked a 304 solution to). It's all a bit messy.

 That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
 solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any
goal you have for RDF (and the rest of the stack) it was a mistake to
not get it right *first*. OWL has better means of dealing with it, but
then, complexity, yadda, yadda.

 http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
 And it's not messy, it's very clean.

Subjective, of course. Have you ever played with an 

Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Ross Singer
And yet for the last 50 years they've been creating MARC?

For the last 20, they've been making EAD, TEI, etc?

As with any of these, there is an expectation that end users will not be
hand rolling machine readable serializations, but inputting into
interfaces.

That is not to say there aren't headaches with RDF (there is no assumption
of order of triples, for example), but associating properties with entity
in which they actually belong, I would argue, is its real strength.

-Ross.
On Nov 3, 2013 10:30 PM, Eric Lease Morgan emor...@nd.edu wrote:

 On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote:

  And it's not very hard given the right mindset -- its just a fully
 expanded
  relational database, where the identifiers are URIs.  Yes, it's not 1st
  year computer science, but it is 2nd or 3rd year rather than post
 graduate.

 Okay, granted, but how many people do we know who can draw an entity
 relationship diagram? In other words, how many people can represent
 knowledge as a relational database? Very few people in Library Land are
 able to get past flat files, let alone relational databases. Yet we are
 hoping to build the Semantic Web where everybody can contribute. I think
 this is a challenge.

 Don’t get me wrong. I think this is a good thing to give a whirl, but I
 think it is hard.

 —
 ELM



Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Ross Singer
Eric,

I can't help but think that part of your problem is that you're using
RDF/XML, which definitely makes it harder to understand and visualize the
data model.

It might help if you switched to an RDF native serialization, like Turtle,
which definitely helps with regards to seeing RDF.

-Ross.
On Nov 4, 2013 6:29 AM, Ross Singer rossfsin...@gmail.com wrote:

 And yet for the last 50 years they've been creating MARC?

 For the last 20, they've been making EAD, TEI, etc?

 As with any of these, there is an expectation that end users will not be
 hand rolling machine readable serializations, but inputting into
 interfaces.

 That is not to say there aren't headaches with RDF (there is no assumption
 of order of triples, for example), but associating properties with entity
 in which they actually belong, I would argue, is its real strength.

 -Ross.
 On Nov 3, 2013 10:30 PM, Eric Lease Morgan emor...@nd.edu wrote:

 On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote:

  And it's not very hard given the right mindset -- its just a fully
 expanded
  relational database, where the identifiers are URIs.  Yes, it's not 1st
  year computer science, but it is 2nd or 3rd year rather than post
 graduate.

 Okay, granted, but how many people do we know who can draw an entity
 relationship diagram? In other words, how many people can represent
 knowledge as a relational database? Very few people in Library Land are
 able to get past flat files, let alone relational databases. Yet we are
 hoping to build the Semantic Web where everybody can contribute. I think
 this is a challenge.

 Don’t get me wrong. I think this is a good thing to give a whirl, but I
 think it is hard.

 —
 ELM




Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Eric Lease Morgan
I am of two minds when it comes to Linked Data and the Semantic Web.

Libraries and many other professions have been encoding things for a long time, 
but encoding the description of a book (MARC) or marking up texts (TEI) is not 
the same as encoding knowledge — a goal of the Semantic Web. The former is a 
process of enhancement — the adding of metadata to an existing object. The 
latter is a process of making assertions of truth. And in the case of the 
former, look at all the variations in describing a book, and think of all the 
different ways a person can mark up a text. We can’t agree.

In general, people do not think very systematically or very logically. We are 
humans full of ambiguity, feelings, and perceptions. We are more animal than we 
are computer. We are more heart than we are mind. We are more like Leonard 
McCoy and less like Spock. Listen to people talk. Quite frequently we do not 
speak in complete sentences, and complete “sentences” are at the heart of the 
Linked Data and the Semantic Web. Think how much we rely on body language to 
convey ideas. If we — as a whole — have this difficulty, then how can we expect 
to capture and encode data, information, and knowledge with the rigor that a 
computer requires, no matter how many front-ends and layers are inserted 
between us and the triples?

Don’t get me wrong. I am of two minds when it comes to Linked Data and the 
Semantic Web. On one hand I believe the technology (think triples) is a decent 
fit and a reasonable way to represent data, information, and knowledge. Heck I’m 
writing a book on the subject with examples of how to accomplish this goal. I 
am sincerely not threatened by this technology, nor do any of the RDF 
serializations get in my way. On the other hand, I just as sincerely wonder if 
the majority of people can manifest the rigor required by truly stupid and 
unforgiving computers to articulate knowledge.

—
Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan
University of Notre Dame


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Karen Coyle

On 11/3/13 12:45 PM, Eric Lease Morgan wrote:

Cool input. Thank you. I believe I have tweaked my assertions:

1. The Declaration of Independence was written by Thomas Jefferson

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:dc="http://purl.org/dc/elements/1.1/">

   <rdf:Description
      rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
      <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
   </rdf:Description>

</rdf:RDF>


To refer to the DoI itself rather than a web page you can use either a 
wikipedia or dbpedia URI:

http://en.wikipedia.org/wiki/Declaration_of_Independence

Also, as has been mentioned, it would be best to use dcterms rather than 
dc elements, since the assumption with dcterms is that the value is an 
identifier rather than a string. So you need:


http://purl.org/dc/terms/

which is either expressed as dct or dcterms

The dc/1.1/ namespace has in a sense been superseded by dc/terms/, but I 
recently did a study of actual usage of Dublin Core in linked data, and in 
fact both are heavily used, although dcterms is by far the more common, due 
to its compatibility with RDF.


  http://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html



2. Thomas Jefferson is a male person

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:foaf="http://xmlns.com/foaf/0.1/">

   <rdf:Description rdf:about="http://id.loc.gov/authorities/names/n79089957">
      <foaf:Person foaf:gender="male" />
   </rdf:Description>

</rdf:RDF>


Using no additional vocabularies (ontologies), I think my hypothetical Linked 
Data spider / robot ought to be able to assert the following:

3. The Declaration of Independence was written by Thomas Jefferson, a male 
person

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">

   <rdf:Description
      rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
      <dc:creator>
         <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
            <foaf:gender>male</foaf:gender>
         </foaf:Person>
      </dc:creator>
   </rdf:Description>

</rdf:RDF>

The W3C Validator…validates Assertion #3, and returns the attached graph, which 
illustrates the logical combination of Assertion #1 and #2.

This is hard. The Semantic Web (and RDF) attempts to codify knowledge using a 
strict syntax, specifically a strict syntax of triples. It is very difficult 
for humans to articulate knowledge, let alone codify it. How realistic is 
the idea of the Semantic Web? I wonder this not because I don’t think the 
technology can handle the problem, but because I think people can’t succinctly 
articulate knowledge (or have great difficulty doing so). Or maybe knowledge 
does not fit into triples?


I agree that it is hard, although it gets easier as you lose some of 
your current data processing baggage and begin to think more in terms of 
triples. For that, like Ross, I really advise you not to do your work in 
RDF/XML -- in a sense RDF/XML is a kluge to force RDF into XML, and it 
is much more complex than RDF in Turtle or plain triples.


I also agree that not all knowledge may fit nicely into triples. RDF is 
great for articulations of things and relationships. Your example here 
is a perfect one for RDF. In fact, it is very simple conceptually and 
could be quite simple as triples. Conceptually you are saying:


URI:DoI   - dct:creator - URI:TJeff
URI:TJeff - rdf:type    - foaf:Person
URI:TJeff - foaf:gender - male
  <!-- I bet we can find a URI for male/female/? -->

I've experimented a bit with using IPython (with Notebook) and the 
Python rdflib library, which can create a virtual triple-store that 
you can query against:

  http://www.rdflib.net/

Again, it's all so much easier if you don't use RDF/XML.

kc



—
Eric Morgan
University of Notre Dame





--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Karen Coyle

+1.

kc

On 11/4/13 3:40 AM, Ross Singer wrote:

Eric,

I can't help but think that part of your problem is that you're using
RDF/XML, which definitely makes it harder to understand and visualize the
data model.

It might help if you switched to an RDF native serialization, like Turtle,
which definitely helps with regards to seeing RDF.

-Ross.
On Nov 4, 2013 6:29 AM, Ross Singer rossfsin...@gmail.com wrote:


And yet for the last 50 years they've been creating MARC?

For the last 20, they've been making EAD, TEI, etc?

As with any of these, there is an expectation that end users will not be
hand rolling machine readable serializations, but inputting into
interfaces.

That is not to say there aren't headaches with RDF (there is no assumption
of order of triples, for example), but associating properties with the
entity to which they actually belong is, I would argue, its real strength.

-Ross.
On Nov 3, 2013 10:30 PM, Eric Lease Morgan emor...@nd.edu wrote:


On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote:


And it's not very hard given the right mindset -- it's just a fully expanded
relational database, where the identifiers are URIs. Yes, it's not 1st
year computer science, but it is 2nd or 3rd year rather than post graduate.

Okay, granted, but how many people do we know who can draw an entity
relationship diagram? In other words, how many people can represent
knowledge as a relational database? Very few people in Library Land are
able to get past flat files, let alone relational databases. Yet we are
hoping to build the Semantic Web where everybody can contribute. I think
this is a challenge.

Don’t get me wrong. I think this is a good thing to give a whirl, but I
think it is hard.

—
ELM



--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Aaron Rubinstein
+1! Well said, Karen.

I would add (to further abuse your metaphor) that it’s also possible to make a 
delicious dish with simple ingredients. With minimal knowledge, most 
non-computer-science-y folks can cook up some structured data in RDF, perhaps 
encoded in RDFa, deliver it on the same HTML pages they are already presenting 
to the public, and add a surprisingly large amount of value to the information 
they publish.

I do completely agree that there’s some intellectual work necessary to do this 
effectively, but the same is certainly true of metadata creation. In fact, I 
would say that those with library backgrounds are well suited to shape and 
present knowledge for machine processing. 

Finally, the same principles of publishing information on the human-readable 
Web apply to the structured data Web. Anyone can say anything about anything; 
it’s just up to us to figure out whether that information is meaningful or 
accurate. The more we build trusted sources by publishing and shaping that 
information with standards, best practices, and transparency, the more 
effective the future Web will be.

Aaron 

On Nov 4, 2013, at 9:59 AM, Karen Coyle li...@kcoyle.net wrote:

 Eric, I really don't see how RDF or linked data is any more difficult to 
 grasp than a database design -- and database design is a tool used by 
 developers to create information systems for people who will never have to 
 think about database design. Imagine the rigor that goes into the creation of 
 the app Angry Birds and imagine how many users are even aware of the 
 calculation of trajectories, speed, and the inter-relations between things on 
 the screen that will fall or explode or whatever.
 
 A master chef understands the chemistry of his famous dessert - the rest of 
 us just eat and enjoy.
 
 kc
 
 On 11/4/13 6:40 AM, Eric Lease Morgan wrote:
 I am of two minds when it comes to Linked Data and the Semantic Web.
 
 Libraries and many other professions have been encoding things for a long 
 time, but encoding the description of a book (MARC) or marking up texts 
 (TEI) is not the same as encoding knowledge — a goal of the Semantic Web. 
 The former is a process of enhancement — the adding of metadata to an 
 existing object. The latter is a process of making assertions of truth. And 
 in the case of the former, look at all the variations in describing a book, 
 and think of all the different ways a person can mark up a text. We can’t 
 agree.
 
 In general, people do not think very systematically or very logically. We 
 are humans full of ambiguity, feelings, and perceptions. We are more animal 
 than we are computer. We are more heart than we are mind. We are more like 
 Leonard McCoy and less like Spock. Listen to people talk. Quite frequently 
 we do not speak in complete sentences, and complete “sentences” are at the 
 heart of the Linked Data and the Semantic Web. Think how much we rely on 
 body language to convey ideas. If we — as a whole — have this difficulty, 
 then how can we expect to capture and encode data, information, and 
 knowledge with the rigor that a computer requires, no matter how many 
 front-ends and layers are inserted between us and the triples?
 
 Don’t get me wrong. I am of two minds when it comes to Linked Data and the 
 Semantic Web. On one hand I believe the technology (think triples) is a 
 decent fit and a reasonable way to represent data, information, and 
 knowledge. Heck I’m writing a book on the subject with examples of how to 
 accomplish this goal. I am sincerely not threatened by this technology, nor 
 do any of the RDF serializations get in my way. On the other hand, I just as 
 sincerely wonder if the majority of people can manifest the rigor required 
 by truly stupid and unforgiving computers to articulate knowledge.
 
 —
 Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan
 University of Notre Dame
 
 -- 
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Kyle Banerjee
 In general, people do not think very systematically nor very logically. We
 are humans full of ambiguity, feelings, and perceptions If we — as a
 whole — have this difficulty, then how can we expect to capture and encode
 data, information, and knowledge with the rigor that a computer requires...


Life is analog and context dependent so the hopeless inconsistency normally
found in metadata outside controlled environments should be expected.

Given how difficult it is to get good metadata from people for things they
know well and care about a great deal (how many people do you know who
don't have trouble managing personal photos and important files?), I
wouldn't hold my breath that there will be much useful human generated
metadata anytime soon. Despite their problems, heuristics strike me as a
better way to go in the long term.

kyle


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Alexander Johannesen
Hiya,

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle li...@kcoyle.net wrote:
 Eric, I really don't see how RDF or linked data is any more difficult to
 grasp than a database design

Well, there's at least one thing that makes people tilt: the flexible
structures for semantics (or ontologies), where things aren't as
solid as in a data model. A framework with (on the surface of it) endless
options for relationships between things is daunting to people who come
from a world where the options are cast in iron. There's also a shift away
from things' identities being tied down in a model somewhere toward a world
where identities are a bit more, hmm, flexible? And less rigid? That can
make some people cringe, as well.

 A master chef understands the chemistry of his famous dessert - the rest of
 us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)


Alex


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Aaron Rubinstein
Hi Eric, 

Complex ideas that span multiple triples are often expressed through SPARQL. In 
other words, you store a soup of triple statements and the SPARQL query 
traverses the triples and presents the resulting information in a variety of 
formats, much in the same way you’d query a database using JOINs and present 
the resulting data on a single Web page.

Using your graph, this SPARQL query should return the work and the gender of 
the work's creator:

PREFIX dc: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?work ?gender
WHERE {
    ?work dc:creator ?creator .
    ?creator foaf:gender ?gender .
}


If you want to explicitly state that the Declaration of Independence was 
written by a male, you would need a predicate that’s set up to do that, 
something that takes a work as its domain and a gender as its range. It would 
also help to have a class for gender. That way, you could have a triple 
statement like this:

<http://www.worldcat.org/identities/lccn-n79-89957>
    foaf:name “Thomas Jefferson” ;
    a :Male .

and you could infer that, if:

<http://www.archives.gov/exhibits/charters/declaration_transcript.html>
    dc:creator <http://www.worldcat.org/identities/lccn-n79-89957> .

then the creator of the Declaration is of class :Male:

<http://www.archives.gov/exhibits/charters/declaration_transcript.html>
    :createdByGender :Male .
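
That inference rule can even be sketched without any RDF tooling. Below, triples are plain Python tuples, and :createdByGender / :Male are the hypothetical terms from this example, not part of any published vocabulary:

```python
# Sketch of the inference rule using plain tuples as triples.
# ":createdByGender" and ":Male" are hypothetical terms from the email,
# not part of any published vocabulary.
triples = {
    ("ex:declaration_transcript", "dc:creator", "ex:jefferson"),
    ("ex:jefferson", "rdf:type", ":Male"),
}

# Rule: if ?work dc:creator ?person and ?person rdf:type :Male,
# then infer ?work :createdByGender :Male.
inferred = {
    (s, ":createdByGender", ":Male")
    for (s, p, o) in triples
    if p == "dc:creator" and (o, "rdf:type", ":Male") in triples
}

triples |= inferred
print(sorted(inferred))
```

A real reasoner generalizes exactly this pattern: scan the graph for triples matching a rule's conditions and add the entailed triples back to the store.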

All the best, 

Aaron Rubinstein






On Nov 3, 2013, at 12:00 AM, Eric Lease Morgan emor...@nd.edu wrote:

 
 How can I write an RDF serialization enabling me to express the fact that the 
 United States Declaration Of Independence was written by Thomas Jefferson and 
 Thomas Jefferson was a male? (And thus asserting that the Declaration of 
 Independence was written by a male.)
 
 Suppose I have the following assertion:
 
  <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

    <!-- the Declaration Of Independence was authored by Thomas Jefferson -->
    <rdf:Description
      rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
      <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
    </rdf:Description>

  </rdf:RDF>
 
 Suppose I have a second assertion:
 
  <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">

    <!-- Thomas Jefferson was a male -->
    <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
      <foaf:gender>male</foaf:gender>
    </rdf:Description>

  </rdf:RDF>
 
 Now suppose a cool Linked Data robot came along and harvested my RDF/XML. 
 Moreover, let's assume the robot could make the logical conclusion that the 
 Declaration was written by a male. How might the robot express this fact in 
 RDF/XML? The following is my first attempt at such an expression, but the 
 resulting graph (attached) doesn't seem to visually express what I really 
 want:
 
 <rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:foaf="http://xmlns.com/foaf/0.1/"
   xmlns:dc="http://purl.org/dc/elements/1.1/">

   <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
     <foaf:gender>male</foaf:gender>
   </rdf:Description>

   <rdf:Description
     rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
     <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
   </rdf:Description>
 </rdf:RDF>
 
 Am I doing something wrong? How might you encode the following 
 expression — The Declaration Of Independence was authored by Thomas 
 Jefferson, and Thomas Jefferson was a male; and therefore, the Declaration Of 
 Independence was authored by a male named Thomas Jefferson? Maybe RDF cannot 
 express this fact because it requires two predicates in a single expression, 
 and thus the expression would not be a triple but rather a “quadruple” — 
 object, predicate #1, subject/object, predicate #2, and subject?
 
 
 —
 Eric Morgan
 
 
 


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan

Cool input. Thank you. I believe I have tweaked my assertions:

1. The Declaration of Independence was written by Thomas Jefferson

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">

  <rdf:Description
    rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
  </rdf:Description>

</rdf:RDF>


2. Thomas Jefferson is a male person

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">

  <rdf:Description rdf:about="http://id.loc.gov/authorities/names/n79089957">
    <foaf:Person foaf:gender="male" />
  </rdf:Description>

</rdf:RDF>


Using no additional vocabularies (ontologies), I think my hypothetical Linked 
Data spider / robot ought to be able to assert the following:

3. The Declaration of Independence was written by Thomas Jefferson, a male 
person

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">

  <rdf:Description
    rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>
      <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
        <foaf:gender>male</foaf:gender>
      </foaf:Person>
    </dc:creator>
  </rdf:Description>

</rdf:RDF>

The W3C Validator…validates Assertion #3, and returns the attached graph, which 
illustrates the logical combination of Assertion #1 and #2.

This is hard. The Semantic Web (and RDF) attempts to codify knowledge using a 
strict syntax, specifically a strict syntax of triples. It is very difficult 
for humans to articulate knowledge, let alone codify it. How realistic is 
the idea of the Semantic Web? I wonder this not because I don’t think the 
technology can handle the problem, but because I think people can’t succinctly 
articulate knowledge (or have great difficulty doing so). Or maybe knowledge 
does not fit into triples?

—
Eric Morgan
University of Notre Dame


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Robert Sanderson
You're still missing a vital step.

Currently your assertion is that the creator /of a web page/ is Jefferson,
which is clearly false.

The page (...) is a transcription of the Declaration of Independence.
The Declaration of Independence is written by Jefferson.
Jefferson is Male.

And it's not very hard given the right mindset -- it's just a fully expanded
relational database, where the identifiers are URIs.  Yes, it's not 1st
year computer science, but it is 2nd or 3rd year rather than post graduate.

Which is not to say that people do not have great trouble succinctly
articulating knowledge, but like any skill, it can be learned. Just look at
the variation in the ways of writing papers ... some people can do it very
clearly, some have much more difficulty.

And with JSON-LD, you don't have to understand the RDF, just a clean
representation of it.

Rob



On Sun, Nov 3, 2013 at 1:45 PM, Eric Lease Morgan emor...@nd.edu wrote:


 Cool input. Thank you. I believe I have tweaked my assertions:

 1. The Declaration of Independence was written by Thomas Jefferson

  <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

    <rdf:Description
      rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
      <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
    </rdf:Description>

  </rdf:RDF>


 2. Thomas Jefferson is a male person

  <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">

    <rdf:Description rdf:about="http://id.loc.gov/authorities/names/n79089957">
      <foaf:Person foaf:gender="male" />
    </rdf:Description>

  </rdf:RDF>


 Using no additional vocabularies (ontologies), I think my hypothetical
 Linked Data spider / robot ought to be able to assert the following:

 3. The Declaration of Independence was written by Thomas Jefferson, a male
 person

  <rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">

    <rdf:Description
      rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
      <dc:creator>
        <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
          <foaf:gender>male</foaf:gender>
        </foaf:Person>
      </dc:creator>
    </rdf:Description>

  </rdf:RDF>

 The W3C Validator…validates Assertion #3, and returns the attached graph,
 which illustrates the logical combination of Assertion #1 and #2.

 This is hard. The Semantic Web (and RDF) attempts to codify knowledge
 using a strict syntax, specifically a strict syntax of triples. It is very
 difficult for humans to articulate knowledge, let alone codify it. How
 realistic is the idea of the Semantic Web? I wonder this not because I
 don’t think the technology can handle the problem, but because I think
 people can’t succinctly articulate knowledge (or have great difficulty
 doing so). Or maybe knowledge does not fit into triples?

 —
 Eric Morgan
 University of Notre Dame





Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan
On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote:

 Currently your assertion is that the creator /of a web page/ is Jefferson,
 which is clearly false.
 
 The page (...) is a transcription of the Declaration of Independence.
 The Declaration of Independence is written by Jefferson.
 Jefferson is Male.

Okay. ‘Makes sense, but let’s find a URI for THE Declaration Of Independence — 
that thing under glass in the National Archives. —ELM


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan
On Nov 3, 2013, at 6:07 PM, Robert Sanderson azarot...@gmail.com wrote:

 And it's not very hard given the right mindset -- it's just a fully expanded
 relational database, where the identifiers are URIs.  Yes, it's not 1st
 year computer science, but it is 2nd or 3rd year rather than post graduate.

Okay, granted, but how many people do we know who can draw an entity 
relationship diagram? In other words, how many people can represent knowledge 
as a relational database? Very few people in Library Land are able to get past 
flat files, let alone relational databases. Yet we are hoping to build the 
Semantic Web where everybody can contribute. I think this is a challenge.

Don’t get me wrong. I think this is a good thing to give a whirl, but I think 
it is hard.

—
ELM