Re: [CODE4LIB] rdf serialization

2013-11-07 Thread Cary Gordon
It's a riddle where all of the answers can be purchased from Amazon!

Go AMZN!

Cary

On Nov 7, 2013, at 1:47 PM, Karen Coyle  wrote:

> Ross, I think you are not alone, as per this:
> 
> http://howfuckedismydatabase.com/nosql/
> 
> kc
> 
> On 11/6/13 8:54 AM, Ross Singer wrote:
>> Hey Karen,
>> 
>> It's purely anecdotal (albeit anecdotes borne from working at a company
>> that offered, and has since abandoned, a sparql-based triple store
>> service), but I just don't see the interest in arbitrary SPARQL queries
>> against remote datasets that I do against linking to (and grabbing) known
>> items.  I think there are multiple reasons for this:
>> 
>> 1) Unless you're already familiar with the dataset behind the SPARQL
>> endpoint, where do you even start with constructing useful queries?
>> 2) SPARQL as a query language is a combination of being too powerful and
>> completely useless in practice: query timeouts are commonplace, endpoints
>> don't support all of 1.1, etc.  And, going back to point #1, it's hard to
>> know how to optimize your queries unless you are already pretty familiar
>> with the data
>> 3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
>> reason we don't offer a public SQL interface to our RDBMSes
>> 
>> Which isn't to say it doesn't have its uses or applications.
>> 
>> I just think that in most cases domain/service-specific APIs (be they
>> RESTful, based on the Linked Data API [0], whatever) will likely be favored
>> over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
>> sure the answer is "no", but that's the future I foresee, personally.
>> 
>> -Ross.
>> 0. https://code.google.com/p/linked-data-api/wiki/Specification
>> 
>> 
>> On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:
>> 
>>> Ross, I agree with your statement that data doesn't have to be "RDF all
>>> the way down", etc. But I'd like to hear more about why you think SPARQL
>>> availability has less value, and if you see an alternative to SPARQL for
>>> querying.
>>> 
>>> kc
>>> 
>>> 
>>> 
>>> On 11/6/13 8:11 AM, Ross Singer wrote:
>>> 
 Hugh, I don't think you're in the weeds with your question (and, while I
 think that named graphs can provide a solution to your particular problem,
 that doesn't necessarily mean that it doesn't raise more questions or
 potentially more frustrations down the line - like any new power, it can
 be
 used for good or evil and the difference might not be obvious at first).
 
 My question for you, however, is why are you using a triple store for
 this?
   That is, why bother with the broad and general model in what I assume
 is a
 closed world assumption in your application?
 
 We don't generally use XML databases (Marklogic being a notable
 exception),
or MARC databases, or <insert transmission format of choice>-specific
 databases because usually transmission formats are designed to account for
 lots and lots of variations and maximum flexibility, which generally is
 the
 opposite of the modeling that goes into a specific app.
 
 I think there's a world of difference between modeling your data so it can
 be represented in RDF (and, possibly, available via SPARQL, but I think
 there is *far* less value there) and committing to RDF all the way down.
   RDF is a generalization so multiple parties can agree on what data
 means,
 but I would have a hard time swallowing the argument that domain-specific
 data must be RDF-native.
 
 -Ross.
 
 
 On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
 wrote:
 
  Does that work right down to the level of the individual triple though?
> If
> a large percentage of my triples are each in their own individual graphs,
> won't that be chaos? I really don't know the answer, it's not a
> rhetorical
> question!
> 
> Hugh
> 
> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
> 
>  Named Graphs are the way to solve the issue you bring up in that post,
>> in
>> my opinion.  You mint an identifier for the graph, and associate the
>> provenance and other information with that.  This then gets ingested as
>> 
> the
> 
>> 4th URI into a quad store, so you don't lose the provenance information.
>> 
>> In JSON-LD:
>> {
>>   "@id" : "uri-for-graph",
>>   "dcterms:creator" : "uri-for-hugh",
>>   "@graph" : [
>>// ... triples go here ...
>>   ]
>> }
>> 
>> Rob
>> 
>> 
>> 
>> On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 
>> 
> wrote:
> 
>> I wrote about this a few months back at
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
> 
>> I'd be very interested to hear what the smart folks here think!
>>> Hugh
>>> 
>>> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
>>> alexander.johanne...@gmail.com> wrote:
>>> 

Re: [CODE4LIB] rdf serialization

2013-11-07 Thread Karen Coyle

Ross, I think you are not alone, as per this:

http://howfuckedismydatabase.com/nosql/

kc

On 11/6/13 8:54 AM, Ross Singer wrote:

Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is "no", but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:


Ross, I agree with your statement that data doesn't have to be "RDF all
the way down", etc. But I'd like to hear more about why you think SPARQL
availability has less value, and if you see an alternative to SPARQL for
querying.

kc



On 11/6/13 8:11 AM, Ross Singer wrote:


Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can
be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for
this?
   That is, why bother with the broad and general model in what I assume
is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable
exception),
or MARC databases, or <insert transmission format of choice>-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is
the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
   RDF is a generalization so multiple parties can agree on what data
means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
wrote:

  Does that work right down to the level of the individual triple though?

If
a large percentage of my triples are each in their own individual graphs,
won't that be chaos? I really don't know the answer, it's not a
rhetorical
question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:

  Named Graphs are the way to solve the issue you bring up in that post,

in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as


the


4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
   "@id" : "uri-for-graph",
   "dcterms:creator" : "uri-for-hugh",
   "@graph" : [
      // ... triples go here ...
   ]
}
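[Editor's note] Rob's named-graph suggestion can be sketched concretely. The following is a toy illustration only, not a real triple store: a quad store modeled as Python tuples, with provenance attached to the graph URI as Rob describes. All URIs are hypothetical placeholders.

```python
# Toy quad store: each triple carries a fourth element, the named-graph URI.
# Provenance statements are themselves just triples about the graph URI.
quads = []  # list of (subject, predicate, object, graph) tuples

def add_graph(graph_uri, triples, provenance):
    """Store a batch of triples under one named graph, plus its provenance."""
    for s, p, o in triples:
        quads.append((s, p, o, graph_uri))
    for p, o in provenance.items():
        quads.append((graph_uri, p, o, "urn:x-graph:provenance"))

add_graph(
    "urn:x-graph:hughs-batch",
    [("urn:x-item:1", "dcterms:title", "Some Title")],
    {"dcterms:creator": "urn:x-person:hugh"},
)

# Who asserted the triples in a given graph? The provenance survives ingest.
creators = [o for s, p, o, g in quads
            if s == "urn:x-graph:hughs-batch" and p == "dcterms:creator"]
print(creators)  # ['urn:x-person:hugh']
```

The point of the sketch: the graph URI is an ordinary resource, so anything (creator, source, date) can be said about it without touching the triples it contains.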

Rob



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 


wrote:


I wrote about this a few months back at

http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/


I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen <
alexander.johanne...@gmail.com> wrote:

  But the

question to every piece of meta data is *authority*, which is the part
of RDF that sucks.


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet





Re: [CODE4LIB] rdf serialization

2013-11-07 Thread Karen Coyle

Ethan, thanks, it's good to have examples.

I'd say that for "simple linking" SPARQL may not be necessary, perhaps 
should be avoided, but IF you need something ELSE, say a query WHERE you 
have conditions, THEN you may find that a query language is needed.


kc

On 11/6/13 9:14 AM, Ethan Gruber wrote:

I think that the answer to #1 is that if you want or expect people to use
your endpoint that you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.

Ethan


On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer  wrote:


Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is "no", but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:


Ross, I agree with your statement that data doesn't have to be "RDF all
the way down", etc. But I'd like to hear more about why you think SPARQL
availability has less value, and if you see an alternative to SPARQL for
querying.

kc



On 11/6/13 8:11 AM, Ross Singer wrote:


Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular

problem,

that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can
be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for
this?
   That is, why bother with the broad and general model in what I assume
is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable
exception),
or MARC databases, or <insert transmission format of choice>-specific

databases because usually transmission formats are designed to account

for

lots and lots of variations and maximum flexibility, which generally is
the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it

can

be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
   RDF is a generalization so multiple parties can agree on what data
means,
but I would have a hard time swallowing the argument that

domain-specific

data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
wrote:

  Does that work right down to the level of the individual triple though?

If
a large percentage of my triples are each in their own individual

graphs,

won't that be chaos? I really don't know the answer, it's not a
rhetorical
question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson 

wrote:

  Named Graphs are the way to solve the issue you bring up in that post,

in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested

as

the


4th URI into a quad store, so you don't lose the provenance

information.

In JSON-LD:
{
   "@id" : "

Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I'm skeptical of this in a usable application or interface.

Applications have constraints.  There are predicates you care about, there
are values you display in specific ways.  There are expectations, based on
the domain, in the data that are either driven by the interface or the
needs of the consumers.

I have yet to see an example of "arbitrary and unexpected data" exposed in
an application that people actually use.

-Ross.


On Wed, Nov 6, 2013 at 11:39 AM, Hugh Cayless  wrote:

> The answer is purely because the RDF data model and the technology around
> it looks like it would almost do what we need it to.
>
> I do not, and cannot, assume a closed world. The open world assumption is
> one of the attractive things about RDF, in fact :-)
>
> Hugh
>
> On Nov 6, 2013, at 11:11 , Ross Singer  wrote:
>
> > My question for you, however, is why are you using a triple store for
> this?
> > That is, why bother with the broad and general model in what I assume is
> a
> > closed world assumption in your application?
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ethan Gruber
I think that the answer to #1 is that if you want or expect people to use
your endpoint that you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.

Ethan
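[Editor's note] The pattern Ethan describes, a canned SPARQL query run in the background by a simple interface, can be sketched as follows. The endpoint URL is the one named in the thread; the query itself is illustrative, not a documented query for that endpoint, and nothing is actually fetched here.

```python
# Sketch: how a simple HTML/JavaScript front end would send a prepared
# SPARQL query via the standard GET protocol (query in a 'query' parameter).
from urllib.parse import urlencode

ENDPOINT = "http://collection.britishmuseum.org/sparql"

query = """\
SELECT ?s ?label WHERE {
  ?s rdfs:label ?label .
} LIMIT 10
"""

# Building the request URL; a browser-side interface would do the same
# with fetch()/XMLHttpRequest and render the JSON results as a visualization.
request_url = ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
print(request_url.split("?")[0])
```

The user never sees SPARQL; they see a map tab or a chart, which is exactly the division of labor Ethan points to.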


On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer  wrote:

> Hey Karen,
>
> It's purely anecdotal (albeit anecdotes borne from working at a company
> that offered, and has since abandoned, a sparql-based triple store
> service), but I just don't see the interest in arbitrary SPARQL queries
> against remote datasets that I do against linking to (and grabbing) known
> items.  I think there are multiple reasons for this:
>
> 1) Unless you're already familiar with the dataset behind the SPARQL
> endpoint, where do you even start with constructing useful queries?
> 2) SPARQL as a query language is a combination of being too powerful and
> completely useless in practice: query timeouts are commonplace, endpoints
> don't support all of 1.1, etc.  And, going back to point #1, it's hard to
> know how to optimize your queries unless you are already pretty familiar
> with the data
> 3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
> reason we don't offer a public SQL interface to our RDBMSes
>
> Which isn't to say it doesn't have its uses or applications.
>
> I just think that in most cases domain/service-specific APIs (be they
> RESTful, based on the Linked Data API [0], whatever) will likely be favored
> over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
> sure the answer is "no", but that's the future I foresee, personally.
>
> -Ross.
> 0. https://code.google.com/p/linked-data-api/wiki/Specification
>
>
> On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:
>
> > Ross, I agree with your statement that data doesn't have to be "RDF all
> > the way down", etc. But I'd like to hear more about why you think SPARQL
> > availability has less value, and if you see an alternative to SPARQL for
> > querying.
> >
> > kc
> >
> >
> >
> > On 11/6/13 8:11 AM, Ross Singer wrote:
> >
> >> Hugh, I don't think you're in the weeds with your question (and, while I
> >> think that named graphs can provide a solution to your particular
> problem,
> >> that doesn't necessarily mean that it doesn't raise more questions or
> >> potentially more frustrations down the line - like any new power, it can
> >> be
> >> used for good or evil and the difference might not be obvious at first).
> >>
> >> My question for you, however, is why are you using a triple store for
> >> this?
> >>   That is, why bother with the broad and general model in what I assume
> >> is a
> >> closed world assumption in your application?
> >>
> >> We don't generally use XML databases (Marklogic being a notable
> >> exception),
> >> or MARC databases, or <insert transmission format of choice>-specific
> >> databases because usually transmission formats are designed to account
> for
> >> lots and lots of variations and maximum flexibility, which generally is
> >> the
> >> opposite of the modeling that goes into a specific app.
> >>
> >> I think there's a world of difference between modeling your data so it
> can
> >> be represented in RDF (and, possibly, available via SPARQL, but I think
> >> there is *far* less value there) and committing to RDF all the way down.
> >>   RDF is a generalization so multiple parties can agree on what data
> >> means,
> >> but I would have a hard time swallowing the argument that
> domain-specific
> >> data must be RDF-native.
> >>
> >> -Ross.
> >>
> >>
> >> On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
> >> wrote:
> >>
> >>  Does that work right down to the level of the individual triple though?
> >>> If
> >>> a large percentage of my triples are each in their own individual
> graphs,
> >>> won't that be chaos? I really don't know the answer, it's not a
> >>> rhetorical
> >>> question!
> >>>
> >>> Hugh
> >>>
> >>> On Nov 6, 2013, at 10:40 , Robert Sanderson 
> wrote:
> >>>
> >>>  Named Graphs are the way to solve the issue you bring up in that post,
>  in
>  my opinion.  You mint an identifier for the graph, and associate the
>  provenance and other information with that.  This then gets ingested
> as
> 
> >>> the
> >>>
>  4th URI into a quad store, so you don't lo

Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is "no", but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification
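[Editor's note] Ross's contrast between a generic SPARQL endpoint and a domain-specific API can be sketched like this. The route name and query template are hypothetical; the point is that the server, not the client, decides which query shapes exist, so they can be optimized and can't time out arbitrarily.

```python
# Sketch: a domain-specific API exposes a few named, parameterized query
# shapes instead of arbitrary SPARQL from the client.
from string import Template

TEMPLATES = {
    # Each route is a known-item lookup the service has chosen to support.
    "item_by_id": Template(
        "SELECT ?p ?o WHERE { <urn:x-item:$id> ?p ?o } LIMIT 100"
    ),
}

def api_query(route, **params):
    """Resolve a named route to a concrete query; unknown routes are refused."""
    if route not in TEMPLATES:
        raise KeyError("no such API route: " + route)
    return TEMPLATES[route].substitute(**params)

print(api_query("item_by_id", id="42"))
```

This is the same trade-off as not exposing raw SQL: the RDBMS is still underneath, but clients only see curated entry points.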


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:

> Ross, I agree with your statement that data doesn't have to be "RDF all
> the way down", etc. But I'd like to hear more about why you think SPARQL
> availability has less value, and if you see an alternative to SPARQL for
> querying.
>
> kc
>
>
>
> On 11/6/13 8:11 AM, Ross Singer wrote:
>
>> Hugh, I don't think you're in the weeds with your question (and, while I
>> think that named graphs can provide a solution to your particular problem,
>> that doesn't necessarily mean that it doesn't raise more questions or
>> potentially more frustrations down the line - like any new power, it can
>> be
>> used for good or evil and the difference might not be obvious at first).
>>
>> My question for you, however, is why are you using a triple store for
>> this?
>>   That is, why bother with the broad and general model in what I assume
>> is a
>> closed world assumption in your application?
>>
>> We don't generally use XML databases (Marklogic being a notable
>> exception),
>> or MARC databases, or <insert transmission format of choice>-specific
>> databases because usually transmission formats are designed to account for
>> lots and lots of variations and maximum flexibility, which generally is
>> the
>> opposite of the modeling that goes into a specific app.
>>
>> I think there's a world of difference between modeling your data so it can
>> be represented in RDF (and, possibly, available via SPARQL, but I think
>> there is *far* less value there) and committing to RDF all the way down.
>>   RDF is a generalization so multiple parties can agree on what data
>> means,
>> but I would have a hard time swallowing the argument that domain-specific
>> data must be RDF-native.
>>
>> -Ross.
>>
>>
>> On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
>> wrote:
>>
>>  Does that work right down to the level of the individual triple though?
>>> If
>>> a large percentage of my triples are each in their own individual graphs,
>>> won't that be chaos? I really don't know the answer, it's not a
>>> rhetorical
>>> question!
>>>
>>> Hugh
>>>
>>> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>>>
>>>  Named Graphs are the way to solve the issue you bring up in that post,
 in
 my opinion.  You mint an identifier for the graph, and associate the
 provenance and other information with that.  This then gets ingested as

>>> the
>>>
 4th URI into a quad store, so you don't lose the provenance information.

 In JSON-LD:
 {
   "@id" : "uri-for-graph",
   "dcterms:creator" : "uri-for-hugh",
   "@graph" : [
// ... triples go here ...
   ]
 }

 Rob



 On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 

>>> wrote:
>>>
 I wrote about this a few months back at
>
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
>>>
 I'd be very interested to hear what the smart folks here think!
>
> Hugh
>
> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
> alexander.johanne...@gmail.com> wrote:
>
>  But the
>> question to every piece of meta data is *authority*, which is the part
>> of RDF that sucks.
>>
>
> --
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
The answer is purely because the RDF data model and the technology around it 
looks like it would almost do what we need it to.

I do not, and cannot, assume a closed world. The open world assumption is one 
of the attractive things about RDF, in fact :-)

Hugh

On Nov 6, 2013, at 11:11 , Ross Singer  wrote:

> My question for you, however, is why are you using a triple store for this?
> That is, why bother with the broad and general model in what I assume is a
> closed world assumption in your application?


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ross, I agree with your statement that data doesn't have to be "RDF all 
the way down", etc. But I'd like to hear more about why you think SPARQL 
availability has less value, and if you see an alternative to SPARQL for 
querying.


kc


On 11/6/13 8:11 AM, Ross Singer wrote:

Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
  That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or <insert transmission format of choice>-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
  RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless  wrote:


Does that work right down to the level of the individual triple though? If
a large percentage of my triples are each in their own individual graphs,
won't that be chaos? I really don't know the answer, it's not a rhetorical
question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:


Named Graphs are the way to solve the issue you bring up in that post, in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as

the

4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
  "@id" : "uri-for-graph",
  "dcterms:creator" : "uri-for-hugh",
  "@graph" : [
   // ... triples go here ...
  ]
}

Rob



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 

wrote:

I wrote about this a few months back at


http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen <
alexander.johanne...@gmail.com> wrote:


But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks.


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or <insert transmission format of choice>-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
 RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless  wrote:

> Does that work right down to the level of the individual triple though? If
> a large percentage of my triples are each in their own individual graphs,
> won't that be chaos? I really don't know the answer, it's not a rhetorical
> question!
>
> Hugh
>
> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>
> > Named Graphs are the way to solve the issue you bring up in that post, in
> > my opinion.  You mint an identifier for the graph, and associate the
> > provenance and other information with that.  This then gets ingested as
> the
> > 4th URI into a quad store, so you don't lose the provenance information.
> >
> > In JSON-LD:
> > {
> >  "@id" : "uri-for-graph",
> >  "dcterms:creator" : "uri-for-hugh",
> >  "@graph" : [
> >   // ... triples go here ...
> >  ]
> > }
> >
> > Rob
> >
> >
> >
> > On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 
> wrote:
> >
> >> I wrote about this a few months back at
> >>
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
> >>
> >> I'd be very interested to hear what the smart folks here think!
> >>
> >> Hugh
> >>
> >> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
> >> alexander.johanne...@gmail.com> wrote:
> >>
> >>> But the
> >>> question to every piece of meta data is *authority*, which is the part
> >>> of RDF that sucks.
> >>
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
In the kinds of data I have to deal with, who made an assertion, or what 
sources provide evidence for a statement, are vitally important bits of 
information, so it's not just a data-source integration problem, where you're 
taking batches of triples from different sources and putting them together. 
It's a question of how to encode "scholarly", messy, humanities data.

The answer, of course, might be "don't use RDF for that" :-). I'd rather not 
invent something if I don't have to, though.

Hugh
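[Editor's note] The extreme case Hugh is worried about, per-statement provenance, means one named graph per triple. A toy sketch of what that looks like serialized as N-Quads (all URIs are placeholders) makes the bookkeeping cost visible:

```python
# Sketch: one fresh named graph minted per assertion, serialized as N-Quads.
# With per-statement provenance, the graph count grows as fast as the triples.

def nquad(s, p, o, g):
    """Format one quad as an N-Quads line (object as a plain literal here)."""
    return "<%s> <%s> \"%s\" <%s> ." % (s, p, o, g)

statements = [
    ("urn:x-item:1", "urn:x-pred:title", "Title A"),
    ("urn:x-item:1", "urn:x-pred:date", "1850?"),
]

lines = []
for i, (s, p, o) in enumerate(statements):
    graph = "urn:x-graph:stmt-%d" % i  # a graph identifier per statement
    lines.append(nquad(s, p, o, graph))

print(len(lines), "quads,", len(statements), "graphs")
```

Whether this is "chaos" or just overhead depends on the store, but the ratio of graphs to triples here is exactly the 1:1 case the question raises.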

On Nov 6, 2013, at 10:56 , Robert Sanderson  wrote:

> A large number of triples that all have different provenance? I'm curious
> as to how you get them :)
> 
> Rob


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Robert Sanderson
A large number of triples that all have different provenance? I'm curious
as to how you get them :)

Rob


On Wed, Nov 6, 2013 at 8:52 AM, Hugh Cayless  wrote:

> Does that work right down to the level of the individual triple though? If
> a large percentage of my triples are each in their own individual graphs,
> won't that be chaos? I really don't know the answer, it's not a rhetorical
> question!
>
> Hugh


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
Does that work right down to the level of the individual triple though? If a 
large percentage of my triples are each in their own individual graphs, won't 
that be chaos? I really don't know the answer, it's not a rhetorical question!

Hugh
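One way to picture Hugh's scenario (an editorial sketch, not from the thread, with all identifiers invented): in a quad store, per-triple provenance just means the graph column is unique per row. Mechanically that works fine, but the number of graph names grows exactly as fast as the data itself.

```python
# Sketch of the situation Hugh describes: a quad store where every triple
# lives in its own named graph so each one can carry its own provenance.
# All names and URIs here are hypothetical.
quads = [
    # (subject, predicate, object, graph)
    ("ex:declaration", "dc:creator", "ex:jefferson", "ex:graph/1"),
    ("ex:jefferson", "foaf:gender", "male", "ex:graph/2"),
]

# Provenance attaches to the graph name, not to the triple directly.
provenance = {
    "ex:graph/1": "asserted by the archive",
    "ex:graph/2": "asserted by a name authority",
}

def provenance_of(s, p, o):
    """Return the provenance notes standing behind one asserted triple."""
    return [provenance[g] for (s2, p2, o2, g) in quads if (s2, p2, o2) == (s, p, o)]

# One graph per triple means one provenance record per triple -- workable,
# but you mint as many graph URIs as you have triples.
graph_count = len({g for (_, _, _, g) in quads})
print(graph_count)  # 2
```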

On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:

> Named Graphs are the way to solve the issue you bring up in that post, in
> my opinion.  You mint an identifier for the graph, and associate the
> provenance and other information with that.  This then gets ingested as the
> 4th URI into a quad store, so you don't lose the provenance information.
> 
> In JSON-LD:
> {
>  "@id" : "uri-for-graph",
>  "dcterms:creator" : "uri-for-hugh",
>  "@graph" : [
>   // ... triples go here ...
>  ]
> }
> 
> Rob


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Robert Sanderson
Named Graphs are the way to solve the issue you bring up in that post, in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as the
4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
  "@id" : "uri-for-graph",
  "dcterms:creator" : "uri-for-hugh",
  "@graph" : [
   // ... triples go here ...
  ]
}

Rob
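For concreteness, Rob's skeleton can be filled in and round-tripped with nothing but the standard library. This is an editorial sketch with hypothetical URIs: the provenance (creator) hangs off the graph's own identifier, while the data triples sit inside "@graph".

```python
import json

# Rob's JSON-LD skeleton, filled in with invented example URIs.
doc = {
    "@context": {"dcterms": "http://purl.org/dc/terms/"},
    "@id": "http://example.org/graph/hugh-data",          # uri-for-graph
    "dcterms:creator": {"@id": "http://example.org/people/hugh"},
    "@graph": [
        {
            "@id": "http://example.org/doc/declaration",
            "dcterms:creator": {"@id": "http://example.org/people/jefferson"},
        }
    ],
}

# Serialize and parse back: the graph identifier and its contents survive.
serialized = json.dumps(doc, indent=2)
parsed = json.loads(serialized)
print(parsed["@id"])  # http://example.org/graph/hugh-data
```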



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless  wrote:

> I wrote about this a few months back at
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
>
> I'd be very interested to hear what the smart folks here think!
>
> Hugh


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
I wrote about this a few months back at 
http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen 
 wrote:

> But the
> question to every piece of meta data is *authority*, which is the part
> of RDF that sucks.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
I should have known it was a test! ;)

Thanks Karen :)

On 06-11-13 15:20, "Karen Coyle"  wrote:

>I guess if I want anyone to answer my emails, I need to post mistakes.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ben, Yes, I copied from the browser URIs, and that was sloppy. However, 
it was the quickest thing to do, plus it was addressed to a human, not a 
machine. The URI for the LC entry is there on the page. Unfortunately, 
the VIAF URI is called "Permalink" -- which isn't obvious.


I guess if I want anyone to answer my emails, I need to post mistakes. 
When I post correct information, my mail goes unanswered (not even a 
"thanks"). So, thanks, guys.


kc

On 11/6/13 12:47 AM, Ben Companjen wrote:

Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by "better" I guess you meant
"probably more widely used" and "probably longer lasting"? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC <- me
http://companjen.name/id/BC.html <- about me




--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Eric Lease Morgan
> Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
> sake...

ROTFL, because that is exactly what I was thinking. “Vi is better. No, emacs. 
You are both wrong; it is all about BBEdit!” Each tool, whether it be an editor, 
an email client, or an RDF serialization, has its own strengths and 
weaknesses. Like religions, none of them are perfect, but they all have some 
value. —ELM


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ed Summers
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen
 wrote:
> The URIs you gave get me to webpages *about* the Declaration of
> Independence. I'm sure it's just a copy/paste mistake, but in this context
> you want the exact right URIs of course. And by "better" I guess you meant
> "probably more widely used" and "probably longer lasting"? :)
>
> LOC URI for the DoI (the work) is without .html:
> http://id.loc.gov/authorities/names/n79029194
>
> VIAF URI for the DoI is without trailing /:
> http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the
W3C's approach to educating the world about URIs for real world things
hasn't quite caught on, while RESTful ones (promoted by the IETF)
have. If someone as knowledgeable as Karen can do that, what does it
say about our ability as practitioners to use URIs this way, and about
our ability to write software to do it as well?

In a REST world, when you get a 200 OK it doesn't mean the resource is
a Web Document. The resource can be anything, you just happened to
successfully get a representation of it. If you like you can provide
hints about the nature of the resource in the representation, but the
resource itself never goes over the wire, the representation does.
It's a subtle but important difference in two ways of looking at Web
architecture.

If you find yourself interested in making up your own mind about this
you can find the RESTful definitions of resource and representation in
the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1].
You can find language about Web Documents (or at least its more recent
variant, Information Resource) in the W3C's Architecture of the World
Wide Web [2].

Obviously I'm biased towards the IETF's position on this. This is just
my personal opinion from my experience as a Web developer trying to
explain Linked Data to practitioners, looking at the Web we have, and
chatting with good friends who weren't afraid to tell me what they
thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by "better" I guess you meant
"probably more widely used" and "probably longer lasting"? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC <- me
http://companjen.name/id/BC.html <- about me
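Ben's two corrections amount to a simple rule of thumb. A tiny sketch of it (an editorial heuristic, not an LOC or VIAF API): drop the ".html" page suffix and any trailing slash that the browser address bar carried along.

```python
def identifier_from_browser(uri: str) -> str:
    """Heuristic cleanup of an identifier pasted from a browser address bar.

    Sketch only: strips a trailing ".html" and any trailing slash, per the
    corrections above.  Real services publish their canonical URIs.
    """
    if uri.endswith(".html"):
        uri = uri[: -len(".html")]
    return uri.rstrip("/")

loc = identifier_from_browser("http://id.loc.gov/authorities/names/n79029194.html")
viaf = identifier_from_browser("http://viaf.org/viaf/179420344/")
print(loc)   # http://id.loc.gov/authorities/names/n79029194
print(viaf)  # http://viaf.org/viaf/179420344
```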


On 05-11-13 19:03, "Karen Coyle"  wrote:

>Eric, I found an even better URI for you for the Declaration of
>Independence:
>
>http://id.loc.gov/authorities/names/n79029194.html
>
>Now that could be seen as being representative of the name chosen by the
>LC Name Authority, but the related VIAF record, as per the VIAF
>definition of itself, represents the real world thing itself. That URI is:
>
>http://viaf.org/viaf/179420344/
>
>I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
>will add that.
>
>kc


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Hi,

Robert Sanderson  wrote:
> c) I've never used a Topic Maps application. (and see (a))

How do you know?

> There /are/ challenges with RDF [...]
> But for the vast majority of cases, the problems are solved (JSON-LD) or no
> one cares any more (httpRange14).

What are you trying to say here? That httpRange14 somehow solves some
issue, and we no longer need to worry about it?

>> Having said that, there's tuples of many kinds, it's only that the
>> triplet is the most used under the W3C banner. Many are moving to a
>> more expressive quad, a few crazies , for example, even though that
>
> ad hominem? really? Your argument ceased to be valid right about here.

I think you're a touch sensitive, mate. "Crazies" as in, few and
knowledgeable (most RDF users these days don't know what tuples are,
and how they fit into the representation of data) but not mainstream.
I'm one of those crazies. It was meant in jest.

>> may or may not be a better way of dealing with it. In the end, it all
>> comes down to some variation over frames theory (or bundles); a
>> serialisation of key/value pairs with some ontological denotation for
>> what the semantics of that might be.
>
> Except that RDF follows the web architecture through the use of URIs for
> everything. That is not to be under-estimated in terms of scalability and
> long term usage.

So does Topic Maps. Not sure I get your point? This is just semantics
of the key dominator in tuple serialisation, there's nothing
revolutionary about that, it's just an ontological commitment used by
systems. URIs don't give you some magic advantage; they're still a
string of characters as far as representation is concerned, and I dare
say, this points out the flaw in httpRange14 right there; in order to
know representation you need to resolve the identifier, ie. there's a
movable dynamic part to what in most cases needs to be static. Not
saying I have the answer, mind you, but there are some fundamental
problems with knowledge representation in RDF that a lot of people
don't "care about" which I do feel people of a library bent should
care about.

>> But wait, there's more! [big snip]
>
> Your point? You don't like an ontology? #DDTT

My point was the very first words in the following paragraph;

>> Complexity.

And of course I like ontologies. I've bandied them around these parts
for the last 10 years or so, and I'm very happy with RDA/FRBR
directions of late, taking at least RDF/Linked Data seriously. I'm
thus not convinced you understood what I wrote, and if nothing else,
my bad. I'll try again.

> That's no more a problem of RDF than any other system.

Yes, it is. RDF is promoted as a solution to a big problem of findable
and shareable meta data; however, until you understand and use the full
RDF cake, you're scratching the surface and doing things sloppily (and
I'd argue, badly). The whole idea of strict ontologies is rigor,
consistency and better means of normalising the meta data so we all
can use it to represent the same things we're talking about. But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks. Currently it's all balanced on WikiPedia and
dbPedia, which isn't a bad thing all in itself, but neither of those
two are static nor authoritative in the same way, say, a global
library organisation might be. With RDF, people are slowly being
trained to accept all manners of crap meta data, and we as librarians
should not be so eager to accept that. We can say what we like about
the current library tools and models (and, of course, we do; they're
not perfect), but there's a whole missing chunk of what makes RDF
'work' that is, well, sub-par for *knowledge representation*. And
that's our game, no?

The shorter version: the RDF cake, with its myriad of layers and
standards, is too complex for most people to get right, so Linked Data
comes along and tries to be simpler, making the long-term goal harder
to achieve.

I'm not, however, *against* RDF. But I am for pointing out that RDF is
neither easy to work with, nor ideal for any long-term goals we might
have in knowledge representation. RDF could have been made a lot
better which has better solutions upstream, but most of this RDF talk
is stuck in 1.0 territory, suffering the sins of former versions.

>> And then there's that tedious distinction between a web resource and
>> something that represents the thing "in reality" that RDF skipped (and
>> hacked a 304 "solution" to). It's all a bit messy.
>
> That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
> solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any
goal you have for RDF (and the rest of the stack) it was a mistake to
not get it right *first*. OWL has better means of dealing with it, but
then, complexity, yadda, yadda.

> http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
> And it's not messy, it's very clean.

Subjective, of course. H

Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Robert Sanderson
Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
sake.


From the linked, very outdated article:

> In fact, as far as I know I've never used an RDF application, nor do I
> know of any that make me want to use them. So what's wrong with this
> picture?

a) Nothing.  You would never know if you've used a CORBA application
either. Or (insert infrastructure technology here) application.
b) You've never been to the BBC website? You've never used anything that
pulls in content from remote sites? Oh wait, see (a).
c) I've never used a Topic Maps application. (and see (a))

> I find most existing RDF/XML entirely unreadable
Patient: Doctor, Doctor it hurts when I use RDF/XML!
Doctor: Don't Do That Then.   (aka #DDTT)

Already covered in this thread. I'm a strong proponent of JSON-LD.

> I think that when we start to bring on board metadata-rich knowledge
> monuments such as WorldCat ...

See VIAF in this thread. See, if you must, BIBFRAME in this thread.

There /are/ challenges with RDF, not going to argue against that. And in
fact I /have/ recently argued for it:
http://www.cni.org/news/video-rdf-failures-linked-data-letdowns/

But for the vast majority of cases, the problems are solved (JSON-LD) or no
one cares any more (httpRange14).  Named Graphs (those quads used by
crazies you refer to) solve the remaining issues, but aren't standard yet.
 They are, however, cleverly baked into JSON-LD for the time that they are.


On Tue, Nov 5, 2013 at 2:48 PM, Alexander Johannesen <
alexander.johanne...@gmail.com> wrote:
>
> Ross Singer  wrote:
> > This is definitely where RDF outclasses almost every alternative*,
>
> Having said that, there's tuples of many kinds, it's only that the
> triplet is the most used under the W3C banner. Many are moving to a
> more expressive quad, a few crazies , for example, even though that

ad hominem? really? Your argument ceased to be valid right about here.

> may or may not be a better way of dealing with it. In the end, it all
> comes down to some variation over frames theory (or bundles); a
> serialisation of key/value pairs with some ontological denotation for
> what the semantics of that might be.

Except that RDF follows the web architecture through the use of URIs for
everything. That is not to be under-estimated in terms of scalability and
long term usage.


> But wait, there's more! We haven't touched upon the next layer of the
> cake; OWL, which is, more or less, an ontology for dealing with all
> things knowledge and web. And it kinda puzzles me that it is not more
> often mentioned (or used) in the systems we make. A lot of OWL was
> tailored towards being a better language for expressing knowledge
> (which in itself comes from DAML and OIL ontologies), and then there's
> RDFs, and OWL in various formats, and then ...

Your point? You don't like an ontology? #DDTT


> Complexity. The problem, as far as I see it, is that there's not
> enough expression and rigor for the things we want to talk about in
> RDF, but we don't want to complicate things with OWL or RDFs either.

That's no more a problem of RDF than any other system.

> And then there's that tedious distinction between a web resource and
> something that represents the thing "in reality" that RDF skipped (and
> hacked a 304 "solution" to). It's all a bit messy.

That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
solution.
You can use URIs to identify anything.

The 303/httprange14 issue is what happens when you *dereference* a URI that
identifies something that does not have a digital representation because
it's a real world object.  It has a direct impact on RDF, but came from the
TAG not the RDF WG.

http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14

And it's not messy, it's very clean. What it is not, is pragmatic. URIs are
like kittens ... practically free to get, but then you have a kitten to
look after and that costs money.  Thus doubling up your URIs is increasing
the number of kittens you have. [though likely not, in practice, doubling
the cost]

> > * Unless you're writing a parser, then having a kajillion serializations
> > seriously sucks.
> Some of us do. And yes, it sucks. I wonder about non-political
> solutions ever being possible again ...

This I agree with.

Rob


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Ross Singer  wrote:
> This is definitely where RDF outclasses almost every alternative*, because
> each serialization (besides RDF/XML) works extremely well for specific
> purposes [...]

Hmm. That depends on what you mean by "alternative to RDF
serialisation". I can think of a few, amongst them obviously (for me)
is Topic Maps, which doesn't go down the evil triplet way with conversion
back and forth to an underlying data model.

Having said that, there's tuples of many kinds, it's only that the
triplet is the most used under the W3C banner. Many are moving to a
more expressive quad, a few crazies , for example, even though that
may or may not be a better way of dealing with it. In the end, it all
comes down to some variation over frames theory (or bundles); a
serialisation of key/value pairs with some ontological denotation for
what the semantics of that might be.

It's hard to express what we perceive as knowledge in any notational
form. The models and languages we propose are far inferior to what is
needed for a world as complex as it is. But as you quoted George Box,
some models are more useful than others.

My personal experience is that I've got a hatred for RDF and triplets
for many of the same reasons Eric touch on, and as many know, I prefer
the more direct meta model of Topic Maps. However, these two different
serialisation and meta model frameworks are - lo and behold! -
compatible; there's canonical lossless conversion between the two. So
the argument at this point comes down to personal taste for what makes
more sense to you.

As to more on the problems of RDF, read this excellent (but slightly dated)
Bray article:
   http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet

But wait, there's more! We haven't touched upon the next layer of the
cake; OWL, which is, more or less, an ontology for dealing with all
things knowledge and web. And it kinda puzzles me that it is not more
often mentioned (or used) in the systems we make. A lot of OWL was
tailored towards being a better language for expressing knowledge
(which in itself comes from DAML and OIL ontologies), and then there's
RDFs, and OWL in various formats, and then ...

Complexity. The problem, as far as I see it, is that there's not
enough expression and rigor for the things we want to talk about in
RDF, but we don't want to complicate things with OWL or RDFs either.
And then there's that tedious distinction between a web resource and
something that represents the thing "in reality" that RDF skipped (and
hacked a 304 "solution" to). It's all a bit messy.

> * Unless you're writing a parser, then having a kajillion serializations
> seriously sucks.

Some of us do. And yes, it sucks. I wonder about non-political
solutions ever being possible again ...


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
 http://shelter.nu/blog  |  google.com/+AlexanderJohannesen  |
http://xsiteable.org
 http://www.linkedin.com/in/shelterit


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ross Singer
On Tue, Nov 5, 2013 at 9:45 AM, Ed Summers  wrote:

> On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan  wrote:
> > This is hard. The Semantic Web (and RDF) attempt at codifying knowledge
> using a strict syntax, specifically a strict syntax of triples. It is very
> difficult for humans to articulate knowledge, let alone codifying it. How
> realistic is the idea of the Semantic Web? I wonder this not because I
> don’t think the technology can handle the problem. I say this because I
> think people can’t (or have great difficulty) succinctly articulating
> knowledge. Or maybe knowledge does not fit into triples?
>
> I think you're right Eric. I don't think knowledge can be encoded
> completely in triples, any more than it can be encoded completely in
> finding aids or books.
>

Or... anything, honestly.  We're humans. Our understanding and perception
of the universe changes daily.  I don't think it's unreasonable to accept
that any description of the universe, input by a human, will reflect the
fundamental reality that what was encoded might be wrong.  I don't really
buy the argument that RDF is somehow less capable of "succinctly
articulating knowledge" compared to anything else.  "All models are wrong.
 Some are useful."

>
> One thing that I (naively) wasn't fully aware of when I started
> dabbling in the Semantic Web and Linked Data is how much the technology
> is entangled with debates about the philosophy of language. These
> debates play out in a variety of ways, but most notably in
> disagreements about the nature of a resource (httpRange-14) in Web
> Architecture. Shameless plug: Dorothea Salo and I tried to write about
> how some of this impacts the domain of the library/archive [1].
>
OTOH, schema.org doesn't concern itself at all with this dichotomy
(information vs. non-information resource) and I think that most (sane,
pragmatic) practitioners would consider that "linked data", as well.  Given
the fact that schema.org is so easily mapped to RDF, I think this argument
is going to be so polluted (if it isn't already) that it will eventually
have to evolve to a far less academic position.

> One of the strengths of RDF is its notion of a data model that is
> behind the various serializations (xml, ntriples, json, n3, turtle,
> etc). I'm with Ross though: I find it much easier to read rdf as turtle
> or json-ld than rdf/xml.

This is definitely where RDF outclasses almost every alternative*, because
each serialization (besides RDF/XML) works extremely well for specific
purposes:

Turtle is great for writing RDF (either to humans or computers) and being
able to understand what is being modeled.

n-triples/quads is great for sharing data in bulk.

json-ld is ideal for API responses, since the consumer doesn't have to know
anything about RDF to have a useful data object, but if they do, all the
better.

-Ross.
* Unless you're writing a parser, then having a kajillion serializations
seriously sucks.


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Karen Coyle
Eric, I found an even better URI for you for the Declaration of 
Independence:


http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the 
LC Name Authority, but the related VIAF record, as per the VIAF 
definition of itself, represents the real world thing itself. That URI is:


http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I 
will add that.


kc


On 11/2/13 9:00 PM, Eric Lease Morgan wrote:

How can I write an RDF serialization enabling me to express the fact that the 
United States Declaration Of Independence was written by Thomas Jefferson and 
Thomas Jefferson was a male? (And thus asserting that the Declaration of 
Independence was written by a male.)

Suppose I have the following assertion:

   <?xml version="1.0"?>
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:dc="http://purl.org/dc/elements/1.1/">

     <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
       <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
     </rdf:Description>

   </rdf:RDF>

Suppose I have a second assertion:

   <?xml version="1.0"?>
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:foaf="http://xmlns.com/foaf/0.1/">

     <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
       <foaf:gender>male</foaf:gender>
     </rdf:Description>

   </rdf:RDF>

Now suppose a cool Linked Data robot came along and harvested my RDF/XML. 
Moreover, let's assume the robot could make the logical conclusion that the 
Declaration was written by a male. How might the robot express this fact in 
RDF/XML? The following is my first attempt at such an expression, but the 
resulting graph (attached) doesn't seem to visually express what I really want:

   <?xml version="1.0"?>
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            xmlns:foaf="http://xmlns.com/foaf/0.1/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">

     <rdf:Description rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
       <foaf:gender>male</foaf:gender>
     </rdf:Description>

     <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
       <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
     </rdf:Description>

   </rdf:RDF>

Am I doing something wrong? How might you encode the following expression — The 
Declaration Of Independence was authored by Thomas Jefferson, and Thomas Jefferson 
was a male. And therefore, the Declaration Of Independence was authored by a male 
named Thomas Jefferson? Maybe RDF cannot express this fact because it requires two 
predicates in a single expression, and thus the expression would not be a triple but 
rather a "quadruple" — subject, predicate #1, object/subject, predicate #2, and 
object?
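No quadruple is needed: the conclusion is a *path* of two triples joined on the shared node. A toy "Linked Data robot" in plain Python (an editorial sketch; the predicates are abbreviated, the URIs are the ones from the message above):

```python
# Harvest the two assertions as triples, then join them on the shared node.
DOI = "http://www.archives.gov/exhibits/charters/declaration_transcript.html"
TJ = "http://www.worldcat.org/identities/lccn-n79-89957"

triples = {
    (DOI, "dc:creator", TJ),      # first assertion
    (TJ, "foaf:gender", "male"),  # second assertion
}

def male_creators(doc):
    """Creators of doc who are also asserted to be male.

    Two hops across two triples -- the robot never has to mint a new
    statement form, it just keeps both triples and joins them at query time.
    """
    creators = {o for (s, p, o) in triples if s == doc and p == "dc:creator"}
    return {c for c in creators if (c, "foaf:gender", "male") in triples}

print(male_creators(DOI) == {TJ})  # True
```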


—
Eric Morgan






--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Sheila M. Morrissey
Ed -- thanks for the link -- you and Dorothy have written a tremendously clear 
and useful piece
Sheila

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ed 
Summers
Sent: Tuesday, November 05, 2013 9:45 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] rdf serialization

On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan  wrote:
> This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using 
> a strict syntax, specifically a strict syntax of triples. It is very 
> difficult for humans to articulate knowledge, let alone codifying it. How 
> realistic is the idea of the Semantic Web? I wonder this not because I don't 
> think the technology can handle the problem. I say this because I think 
> people can't (or have great difficulty) succinctly articulating knowledge. Or 
> maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded completely in 
triples, any more than it can be encoded completely in finding aids or books.

One thing that I (naively) wasn't fully aware of when I started dabbling in the 
Semantic Web and Linked Data is how much the technology is entangled with 
debates about the philosophy of language. These debates play out in a variety 
of ways, but most notably in disagreements about the nature of a resource 
(httpRange-14) in Web Architecture. Shameless plug: Dorothea Salo and I tried 
to write about how some of this impacts the domain of the library/archive [1].

One of the strengths of RDF is its notion of a data model that is behind the 
various serializations (xml, ntriples, json, n3, turtle, etc). I'm with Ross 
though: I find it much easier to read rdf as turtle or json-ld than rdf/xml.

//Ed

[1] http://arxiv.org/abs/1302.4591


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Tue, Nov 5, 2013 at 10:07 AM, Karen Coyle  wrote:
> I have suggested (repeatedly) to LC on the BIBFRAME list that they should
> use turtle rather than RDF/XML in their examples -- because I suspect that
> they may be doing some "XML think" in the background. This seems to be the
> case because in some of the BIBFRAME documents the examples are in XML but
> not RDF/XML. I find this rather ... disappointing.

I think you'll find that many people and organizations are much more
familiar with xml and its data model than they are with rdf. Sometimes
when people with a strong background in xml come to rdf they naturally
want to keep thinking in terms of xml. This is possible up to a point,
but it eventually hampers understanding.

//Ed


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Aaron Rubinstein
FWIW, 

Here’s the W3C’s RDF Primer with examples in turtle instead of RDF/XML:
http://www.w3.org/2007/02/turtle/primer/

And the turtle spec:
http://www.w3.org/TR/turtle/

Aaron


On Nov 5, 2013, at 10:07 AM, Karen Coyle  wrote:

> On 11/5/13 6:45 AM, Ed Summers wrote:
>> I'm with Ross though:
> ... and Karen!
> 
>> I find it much easier to read rdf as turtle or json-ld than rdf/xml.
> 
> It's easier to read, but it's also easier to create *correctly*, and that, to 
> me, is the key point. Folks who are used to XML have a certain notion of data 
> organization in mind. Working with RDF in XML one tends to fall into the XML 
> data "think" rather than the RDF concepts.
> 
> I have suggested (repeatedly) to LC on the BIBFRAME list that they should use 
> turtle rather than RDF/XML in their examples -- because I suspect that they 
> may be doing some "XML think" in the background. This seems to be the case 
> because in some of the BIBFRAME documents the examples are in XML but not 
> RDF/XML. I find this rather ... disappointing.
> 
> I also find it useful to create "pseudo-code" triples using whatever notation 
> I find handy, as in the example I provided earlier for Eric. Writing out 
> actual valid triples is a pain, but seeing your data as triples is very 
> useful.
> 
> kc
> 
> -- 
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Karen Coyle

On 11/5/13 6:45 AM, Ed Summers wrote:

I'm with Ross though:

... and Karen!


I find it much easier to read rdf as turtle or json-ld than rdf/xml.


It's easier to read, but it's also easier to create *correctly*, and 
that, to me, is the key point. Folks who are used to XML have a certain 
notion of data organization in mind. Working with RDF in XML one tends 
to fall into the XML data "think" rather than the RDF concepts.


I have suggested (repeatedly) to LC on the BIBFRAME list that they 
should use turtle rather than RDF/XML in their examples -- because I 
suspect that they may be doing some "XML think" in the background. This 
seems to be the case because in some of the BIBFRAME documents the 
examples are in XML but not RDF/XML. I find this rather ... disappointing.


I also find it useful to create "pseudo-code" triples using whatever 
notation I find handy, as in the example I provided earlier for Eric. 
Writing out actual valid triples is a pain, but seeing your data as 
triples is very useful.


kc

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Ed Summers
On Sun, Nov 3, 2013 at 3:45 PM, Eric Lease Morgan  wrote:
> This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using 
> a strict syntax, specifically a strict syntax of triples. It is very 
> difficult for humans to articulate knowledge, let alone codifying it. How 
> realistic is the idea of the Semantic Web? I wonder this not because I don’t 
> think the technology can handle the problem. I say this because I think 
> people can’t (or have great difficulty) succinctly articulating knowledge. Or 
> maybe knowledge does not fit into triples?

I think you're right Eric. I don't think knowledge can be encoded
completely in triples, any more than it can be encoded completely in
finding aids or books.

One thing that I (naively) wasn't fully aware of when I started
dabbling in the Semantic Web and Linked Data is how much the technology
is entangled with debates about the philosophy of language. These
debates play out in a variety of ways, but most notably in
disagreements about the nature of a resource (httpRange-14) in Web
Architecture. Shameless plug: Dorothea Salo and I tried to write about
how some of this impacts the domain of the library/archive [1].

One of the strengths of RDF is its notion of a data model that is
behind the various serializations (xml, ntriples, json, n3, turtle,
etc). I'm with Ross though: I find it much easier to read rdf as
turtle or json-ld than rdf/xml.

//Ed

[1] http://arxiv.org/abs/1302.4591


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Alexander Johannesen
Hiya,

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle  wrote:
> Eric, I really don't see how RDF or linked data is any more difficult to
> grasp than a database design

Well, there's at least one thing that makes people tilt: the flexible
structures for semantics (or ontologies), where things aren't as
solid as in a data model. A framework where there are endless options
(on the surface of it) for relationships between things is daunting to
people who come from a world where the options are cast in iron.
There's also a shift away from things' identities being tied down in a
model somewhere into a world where identities are a bit more, hmm,
flexible? And less rigid? That can make some people cringe, as well.

> A master chef understands the chemistry of his famous dessert - the rest of
> us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)


Alex


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Kyle Banerjee
> In general, people do not think very systematically nor very logically. We
> are humans full of ambiguity, feelings, and perceptions. If we — as a
> whole — have this difficulty, then how can we expect to capture and encode
> data, information, and knowledge with the rigor that a computer requires...


Life is analog and context dependent so the hopeless inconsistency normally
found in metadata outside controlled environments should be expected.

Given how difficult it is to get good metadata from people for things they
know well and care about a great deal (how many people do you know who
don't have trouble managing personal photos and important files?), I
wouldn't hold my breath that there will be much useful human-generated
metadata anytime soon. Despite their problems, heuristics strike me as a
better way to go in the long term.

kyle


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Aaron Rubinstein
+1! Well said, Karen.

I would add (to further abuse your metaphor) that it’s also possible to make a 
delicious dish with simple ingredients. With minimal knowledge, most 
non-computer science-y folks can cook up some structured data in RDF, maybe 
encoded in RDFa and deliver it on the same HTML pages they are already 
presenting to the public, and add a surprisingly large amount of value to the 
information they publish.

I do completely agree that there’s some intellectual work necessary to do this 
effectively but the same is certainly true about metadata creation. In fact, I 
would say that those with library backgrounds are well suited to shape and 
present knowledge for machine processing. 

Finally, the same principles of publishing information on the human-readable 
Web apply to the structured data Web. Anyone can say anything about anything, 
it’s just up to us to figure out whether that information is meaningful or 
accurate. The more we build trusted sources by publishing and shaping that 
information with standards, best practices, and transparency, the more 
effective the future Web will be.

Aaron 

On Nov 4, 2013, at 9:59 AM, Karen Coyle  wrote:

> Eric, I really don't see how RDF or linked data is any more difficult to 
> grasp than a database design -- and database design is a tool used by 
> developers to create information systems for people who will never have to 
> think about database design. Imagine the rigor that goes into the creation of 
> the app "Angry Birds" and imagine how many users are even aware of the 
> calculation of trajectories, speed, and the inter-relations between things on 
> the screen that will fall or explode or whatever.
> 
> A master chef understands the chemistry of his famous dessert - the rest of 
> us just eat and enjoy.
> 
> kc
> 
> On 11/4/13 6:40 AM, Eric Lease Morgan wrote:
>> I am of two minds when it comes to Linked Data and the Semantic Web.
>> 
>> Libraries and many other professions have been encoding things for a long 
>> time, but encoding the description of a book (MARC) or marking up texts 
>> (TEI), is not the same as encoding knowledge — a goal of the Semantic Web. 
>> The former is a process of enhancing — the adding of metadata — to an 
>> existing object. The latter is a process of making assertions of truth. And 
>> in the case of the former, look at all the variations of describing a book, 
>> and think of all the different ways a person can mark up a text. We can’t 
>> agree.
>> 
>> In general, people do not think very systematically nor very logically. We 
>> are humans full of ambiguity, feelings, and perceptions. We are more animal 
>> than we are computer. We are more heart than we are mind. We are more like 
>> Leonard McCoy and less like Spock. Listen to people talk. Quite frequently 
>> we do not speak in complete sentences, and complete “sentences” are at the 
>> heart of the Linked Data and the Semantic Web. Think how much we rely on 
>> body language to convey ideas. If we — as a whole — have this difficulty, 
>> then how can we expect to capture and encode data, information, and 
>> knowledge with the rigor that a computer requires, no matter how many 
>> front-ends and layers are inserted between us and the triples?
>> 
>> Don’t get me wrong. I am of two minds when it comes to Linked Data and the 
>> Semantic Web. On one hand I believe the technology (think triples) is a 
>> decent fit and reasonable way to represent data, information, and 
>> knowledge. Heck I’m writing a book on the subject with examples of how to 
>> accomplish this goal. I am sincerely not threatened by this technology, nor 
>> do any of the RDF serializations get in my way. On the other hand, I just as 
>> sincerely wonder if the majority of people can manifest the rigor required 
>> by truly stupid and unforgiving computers to articulate knowledge.
>> 
>> —
>> Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan
>> University of Notre Dame
> 
> -- 
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Karen Coyle
Eric, I really don't see how RDF or linked data is any more difficult to 
grasp than a database design -- and database design is a tool used by 
developers to create information systems for people who will never have 
to think about database design. Imagine the rigor that goes into the 
creation of the app "Angry Birds" and imagine how many users are even 
aware of the calculation of trajectories, speed, and the inter-relations 
between things on the screen that will fall or explode or whatever.


A master chef understands the chemistry of his famous dessert - the rest 
of us just eat and enjoy.


kc

On 11/4/13 6:40 AM, Eric Lease Morgan wrote:

I am of two minds when it comes to Linked Data and the Semantic Web.

Libraries and many other professions have been encoding things for a long time, 
but encoding the description of a book (MARC) or marking up texts (TEI), is not 
the same as encoding knowledge — a goal of the Semantic Web. The former is a 
process of enhancing — the adding of metadata — to an existing object. The 
latter is a process of making assertions of truth. And in the case of the 
former, look at all the variations of describing a book, and think of all the 
different ways a person can mark up a text. We can’t agree.

In general, people do not think very systematically nor very logically. We are 
humans full of ambiguity, feelings, and perceptions. We are more animal than we 
are computer. We are more heart than we are mind. We are more like Leonard 
McCoy and less like Spock. Listen to people talk. Quite frequently we do not 
speak in complete sentences, and complete “sentences” are at the heart of the 
Linked Data and the Semantic Web. Think how much we rely on body language to 
convey ideas. If we — as a whole — have this difficulty, then how can we expect 
to capture and encode data, information, and knowledge with the rigor that a 
computer requires, no matter how many front-ends and layers are inserted 
between us and the triples?

Don’t get me wrong. I am of two minds when it comes to Linked Data and the 
Semantic Web. On one hand I believe the technology (think triples) is a decent 
fit and reasonable way to represent data, information, and knowledge. Heck I’m 
writing a book on the subject with examples of how to accomplish this goal. I 
am sincerely not threatened by this technology, nor do any of the RDF 
serializations get in my way. On the other hand, I just as sincerely wonder if 
the majority of people can manifest the rigor required by truly stupid and 
unforgiving computers to articulate knowledge.

—
Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan
University of Notre Dame


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Karen Coyle

+1.

kc

On 11/4/13 3:40 AM, Ross Singer wrote:

Eric,

I can't help but think that part of your problem is that you're using
RDF/XML, which definitely makes it harder to understand and visualize the
data model.

It might help if you switched to an RDF native serialization, like Turtle,
which definitely helps with regards to "seeing" RDF.

-Ross.
On Nov 4, 2013 6:29 AM, "Ross Singer"  wrote:


And yet for the last 50 years they've been creating MARC?

For the last 20, they've been making EAD, TEI, etc?

As with any of these, there is an expectation that end users will not be
hand rolling machine readable serializations, but inputting into
interfaces.

That is not to say there aren't headaches with RDF (there is no assumption
of order of triples, for example), but associating properties with the
entity to which they actually belong, I would argue, is its real strength.

-Ross.
On Nov 3, 2013 10:30 PM, "Eric Lease Morgan"  wrote:


On Nov 3, 2013, at 6:07 PM, Robert Sanderson  wrote:


And it's not very hard given the right mindset -- it's just a fully
expanded relational database, where the identifiers are URIs.  Yes,
it's not 1st year computer science, but it is 2nd or 3rd year rather
than post graduate.

Okay, granted, but how many people do we know who can draw an entity
relationship diagram? In other words, how many people can represent
knowledge as a relational database? Very few people in Library Land are
able to get past flat files, let alone relational databases. Yet we are
hoping to build the Semantic Web where everybody can contribute. I think
this is a challenge.

Don’t get me wrong. I think this is a good thing to give a whirl, but I
think it is hard.

—
ELM



--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Karen Coyle

On 11/3/13 12:45 PM, Eric Lease Morgan wrote:

Cool input. Thank you. I believe I have tweaked my assertions:

1. The Declaration of Independence was written by Thomas Jefferson

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
  </rdf:Description>
</rdf:RDF>




To refer to the DoI itself rather than a web page you can use either a 
wikipedia or dbpedia URI:

http://en.wikipedia.org/wiki/Declaration_of_Independence

Also, as has been mentioned, it would be best to use dcterms rather than 
dc elements, since the assumption with dcterms is that the value is an 
identifier rather than a string. So you need:


http://purl.org/dc/terms/

which is either expressed as "dct" or "dcterms".

The dc/1.1/ has in a sense been "upgraded" by dc/terms/, but I recently 
did a study of actual usage of Dublin Core in linked data and in fact 
both are heavily used, although dcterms is by far the most common usage 
due to its compatibility with RDF.


  http://kcoyle.blogspot.com/2013/10/dublin-core-usage-in-lod.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-dcterms.html
http://kcoyle.blogspot.com/2013/10/who-uses-dublin-core-original-15.html



2. Thomas Jefferson is a male person

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n7908995">
    <foaf:gender>male</foaf:gender>
  </foaf:Person>
</rdf:RDF>




Using no additional vocabularies (ontologies), I think my hypothetical Linked 
Data spider / robot ought to be able to assert the following:

3. The Declaration of Independence was written by Thomas Jefferson, a male 
person

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>
      <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
        <foaf:gender>male</foaf:gender>
      </foaf:Person>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>



The W3C Validator…validates Assertion #3, and returns the attached graph, which 
illustrates the logical combination of Assertion #1 and #2.

This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using a 
strict syntax, specifically a strict syntax of triples. It is very difficult 
for humans to articulate knowledge, let alone codifying it. How realistic is 
the idea of the Semantic Web? I wonder this not because I don’t think the 
technology can handle the problem. I say this because I think people can’t (or 
have great difficulty) succinctly articulating knowledge. Or maybe knowledge 
does not fit into triples?


I agree that it is hard, although it gets easier as you lose some of 
your current data processing baggage and begin to think more in terms of 
triples. For that, like Ross, I really advise you not to do your work in 
RDF/XML -- in a sense RDF/XML is a kluge to force RDF into XML, and it 
is much more complex than RDF in turtle or plain triples.


I also agree that not all knowledge may fit nicely into triples. RDF is 
great for articulations of things and relationships. Your example here 
is a perfect one for RDF. In fact, it is very simple conceptually and 
could be quite simple as triples. Conceptually you are saying:


URI:DoI -> dct:creator -> URI:TJeff
URI:TJeff -> rdf:type -> foaf:Person
URI:TJeff -> foaf:gender -> "male"

I've experimented a bit with using iPython (with Notebook) and the 
python rdflib, which can create a virtual triple-store that you can 
query against:

  http://www.rdflib.net/

Again, it's all soo much easier if you don't use rdfxml.

kc



—
Eric Morgan
University of Notre Dame

[cid:6A4E613F-CE41-4D35-BDFA-2E66EE7AF20A]




--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Eric Lease Morgan
I am of two minds when it comes to Linked Data and the Semantic Web.

Libraries and many other professions have been encoding things for a long time, 
but encoding the description of a book (MARC) or marking up texts (TEI), is not 
the same as encoding knowledge — a goal of the Semantic Web. The former is a 
process of enhancing — the adding of metadata — to an existing object. The 
latter is a process of making assertions of truth. And in the case of the 
former, look at all the variations of describing a book, and think of all the 
different ways a person can mark up a text. We can’t agree.

In general, people do not think very systematically nor very logically. We are 
humans full of ambiguity, feelings, and perceptions. We are more animal than we 
are computer. We are more heart than we are mind. We are more like Leonard 
McCoy and less like Spock. Listen to people talk. Quite frequently we do not 
speak in complete sentences, and complete “sentences” are at the heart of the 
Linked Data and the Semantic Web. Think how much we rely on body language to 
convey ideas. If we — as a whole — have this difficulty, then how can we expect 
to capture and encode data, information, and knowledge with the rigor that a 
computer requires, no matter how many front-ends and layers are inserted 
between us and the triples?

Don’t get me wrong. I am of two minds when it comes to Linked Data and the 
Semantic Web. On one hand I believe the technology (think triples) is a decent 
fit and reasonable way to represent data, information, and knowledge. Heck I’m 
writing a book on the subject with examples of how to accomplish this goal. I 
am sincerely not threatened by this technology, nor do any of the RDF 
serializations get in my way. On the other hand, I just as sincerely wonder if 
the majority of people can manifest the rigor required by truly stupid and 
unforgiving computers to articulate knowledge.

—
Eric “Spoken Like A Humanist And Less Like A Computer Scientist” Morgan
University of Notre Dame


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Ross Singer
Eric,

I can't help but think that part of your problem is that you're using
RDF/XML, which definitely makes it harder to understand and visualize the
data model.

It might help if you switched to an RDF native serialization, like Turtle,
which definitely helps with regards to "seeing" RDF.

-Ross.
On Nov 4, 2013 6:29 AM, "Ross Singer"  wrote:

> And yet for the last 50 years they've been creating MARC?
>
> For the last 20, they've been making EAD, TEI, etc?
>
> As with any of these, there is an expectation that end users will not be
> hand rolling machine readable serializations, but inputting into
> interfaces.
>
> That is not to say there aren't headaches with RDF (there is no assumption
> of order of triples, for example), but associating properties with the
> entity to which they actually belong, I would argue, is its real strength.
>
> -Ross.
> On Nov 3, 2013 10:30 PM, "Eric Lease Morgan"  wrote:
>
>> On Nov 3, 2013, at 6:07 PM, Robert Sanderson  wrote:
>>
>> > And it's not very hard given the right mindset -- it's just a fully
>> expanded
>> > relational database, where the identifiers are URIs.  Yes, it's not 1st
>> > year computer science, but it is 2nd or 3rd year rather than post
>> graduate.
>>
>> Okay, granted, but how many people do we know who can draw an entity
>> relationship diagram? In other words, how many people can represent
>> knowledge as a relational database? Very few people in Library Land are
>> able to get past flat files, let alone relational databases. Yet we are
>> hoping to build the Semantic Web where everybody can contribute. I think
>> this is a challenge.
>>
>> Don’t get me wrong. I think this is a good thing to give a whirl, but I
>> think it is hard.
>>
>> —
>> ELM
>>
>


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Ross Singer
And yet for the last 50 years they've been creating MARC?

For the last 20, they've been making EAD, TEI, etc?

As with any of these, there is an expectation that end users will not be
hand rolling machine readable serializations, but inputting into
interfaces.

That is not to say there aren't headaches with RDF (there is no assumption
of order of triples, for example), but associating properties with the
entity to which they actually belong, I would argue, is its real strength.

-Ross.
On Nov 3, 2013 10:30 PM, "Eric Lease Morgan"  wrote:

> On Nov 3, 2013, at 6:07 PM, Robert Sanderson  wrote:
>
> > And it's not very hard given the right mindset -- it's just a fully
> expanded
> > relational database, where the identifiers are URIs.  Yes, it's not 1st
> > year computer science, but it is 2nd or 3rd year rather than post
> graduate.
>
> Okay, granted, but how many people do we know who can draw an entity
> relationship diagram? In other words, how many people can represent
> knowledge as a relational database? Very few people in Library Land are
> able to get past flat files, let alone relational databases. Yet we are
> hoping to build the Semantic Web where everybody can contribute. I think
> this is a challenge.
>
> Don’t get me wrong. I think this is a good thing to give a whirl, but I
> think it is hard.
>
> —
> ELM
>


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan
On Nov 3, 2013, at 6:07 PM, Robert Sanderson  wrote:

> And it's not very hard given the right mindset -- it's just a fully expanded
> relational database, where the identifiers are URIs.  Yes, it's not 1st
> year computer science, but it is 2nd or 3rd year rather than post graduate.

Okay, granted, but how many people do we know who can draw an entity 
relationship diagram? In other words, how many people can represent knowledge 
as a relational database? Very few people in Library Land are able to get past 
flat files, let alone relational databases. Yet we are hoping to build the 
Semantic Web where everybody can contribute. I think this is a challenge.

Don’t get me wrong. I think this is a good thing to give a whirl, but I think 
it is hard.

—
ELM


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan
On Nov 3, 2013, at 6:07 PM, Robert Sanderson  wrote:

> Currently your assertion is that the creator /of a web page/ is Jefferson,
> which is clearly false.
> 
> The page (...) is a transcription of the Declaration of Independence.
> The Declaration of Independence is written by Jefferson.
> Jefferson is Male.

Okay. ‘Makes sense, but let’s find a URI for THE Declaration Of Independence — 
that thing under glass in the National Archives. —ELM


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Robert Sanderson
You're still missing a vital step.

Currently your assertion is that the creator /of a web page/ is Jefferson,
which is clearly false.

The page (...) is a transcription of the Declaration of Independence.
The Declaration of Independence is written by Jefferson.
Jefferson is Male.

And it's not very hard given the right mindset -- it's just a fully expanded
relational database, where the identifiers are URIs.  Yes, it's not 1st
year computer science, but it is 2nd or 3rd year rather than post graduate.

Which is not to say that people do not have great trouble succinctly
articulating knowledge, but like any skill, it can be learned. Just look at
the variation in the ways of writing papers ... some people can do it very
clearly, some have much more difficulty.

And with JSON-LD, you don't have to understand the RDF, just a clean
representation of it.

Rob



On Sun, Nov 3, 2013 at 1:45 PM, Eric Lease Morgan  wrote:

>
> Cool input. Thank you. I believe I have tweaked my assertions:
>
> 1. The Declaration of Independence was written by Thomas Jefferson
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://purl.org/dc/elements/1.1/">
>   <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>     <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
>   </rdf:Description>
> </rdf:RDF>
> 
>
>
> 2. Thomas Jefferson is a male person
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:foaf="http://xmlns.com/foaf/0.1/">
>   <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n7908995">
>     <foaf:gender>male</foaf:gender>
>   </foaf:Person>
> </rdf:RDF>
>
> 
>
>
> Using no additional vocabularies (ontologies), I think my hypothetical
> Linked Data spider / robot ought to be able to assert the following:
>
> 3. The Declaration of Independence was written by Thomas Jefferson, a male
> person
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://purl.org/dc/elements/1.1/"
>          xmlns:foaf="http://xmlns.com/foaf/0.1/">
>   <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>     <dc:creator>
>       <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
>         <foaf:gender>male</foaf:gender>
>       </foaf:Person>
>     </dc:creator>
>   </rdf:Description>
> </rdf:RDF>
>
> 
>
> The W3C Validator…validates Assertion #3, and returns the attached graph,
> which illustrates the logical combination of Assertion #1 and #2.
>
> This is hard. The Semantic Web (and RDF) attempt at codifying knowledge
> using a strict syntax, specifically a strict syntax of triples. It is very
> difficult for humans to articulate knowledge, let alone codifying it. How
> realistic is the idea of the Semantic Web? I wonder this not because I
> don’t think the technology can handle the problem. I say this because I
> think people can’t (or have great difficulty) succinctly articulating
> knowledge. Or maybe knowledge does not fit into triples?
>
> —
> Eric Morgan
> University of Notre Dame
>
> [cid:6A4E613F-CE41-4D35-BDFA-2E66EE7AF20A]
>
>


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Eric Lease Morgan

Cool input. Thank you. I believe I have tweaked my assertions:

1. The Declaration of Independence was written by Thomas Jefferson

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://id.loc.gov/authorities/names/n79089957</dc:creator>
  </rdf:Description>
</rdf:RDF>




2. Thomas Jefferson is a male person

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n7908995">
    <foaf:gender>male</foaf:gender>
  </foaf:Person>
</rdf:RDF>




Using no additional vocabularies (ontologies), I think my hypothetical Linked 
Data spider / robot ought to be able to assert the following:

3. The Declaration of Independence was written by Thomas Jefferson, a male 
person

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>
      <foaf:Person rdf:about="http://id.loc.gov/authorities/names/n79089957">
        <foaf:gender>male</foaf:gender>
      </foaf:Person>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>



The W3C Validator…validates Assertion #3, and returns the attached graph, which 
illustrates the logical combination of Assertion #1 and #2.

This is hard. The Semantic Web (and RDF) attempt at codifying knowledge using a 
strict syntax, specifically a strict syntax of triples. It is very difficult 
for humans to articulate knowledge, let alone codifying it. How realistic is 
the idea of the Semantic Web? I wonder this not because I don’t think the 
technology can handle the problem. I say this because I think people can’t (or 
have great difficulty) succinctly articulating knowledge. Or maybe knowledge 
does not fit into triples?

—
Eric Morgan
University of Notre Dame

[cid:6A4E613F-CE41-4D35-BDFA-2E66EE7AF20A]


Re: [CODE4LIB] rdf serialization

2013-11-03 Thread Aaron Rubinstein
Hi Eric, 

Complex ideas that span multiple triples are often expressed through SPARQL. In 
other words, you store a soup of triple statements and the SPARQL query 
traverses the triples and presents the resulting information in a variety of 
formats, much in the same way you’d query a database using JOINs and present 
the resulting data on a single Web page.

Using your graph, this SPARQL query should return the work and the gender of 
the work's creator:

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?work ?gender
WHERE {
  ?work dc:creator ?creator .
  ?creator foaf:gender ?gender .
}


If you want to explicitly state that the Declaration of Independence was 
written by a male, you would need a predicate that’s set up to do that, 
something that takes a work as its domain and has a range of a gender. It would 
also help to have a class for gender. That way, you could have a triple 
statement like this:


<http://www.worldcat.org/identities/lccn-n79-89957>
    foaf:name "Thomas Jefferson" ;
    a :Male .

and you could infer that if:

<http://www.archives.gov/exhibits/charters/declaration_transcript.html>
    dc:creator <http://www.worldcat.org/identities/lccn-n79-89957> .

The creator of the Declaration is of class :Male:

<http://www.archives.gov/exhibits/charters/declaration_transcript.html>
    :createdByGender :Male .

All the best, 

Aaron Rubinstein






On Nov 3, 2013, at 12:00 AM, Eric Lease Morgan  wrote:

> 
> How can I write an RDF serialization enabling me to express the fact that the 
> United States Declaration Of Independence was written by Thomas Jefferson and 
> Thomas Jefferson was a male? (And thus asserting that the Declaration of 
> Independence was written by a male.)
> 
> Suppose I have the following assertion:
> 
> <rdf:RDF
>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>   xmlns:dc="http://purl.org/dc/elements/1.1/">
>
>   <rdf:Description
>     rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>     <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
>   </rdf:Description>
>
> </rdf:RDF>
> 
> Suppose I have a second assertion:
> 
> <rdf:RDF
>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>   xmlns:foaf="http://xmlns.com/foaf/0.1/">
>
>   <rdf:Description
>     rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
>     <foaf:gender>male</foaf:gender>
>   </rdf:Description>
>
> </rdf:RDF>
> 
> Now suppose a cool Linked Data robot came along and harvested my RDF/XML. 
> Moreover, let's assume the robot could make the logical conclusion that the 
> Declaration was written by a male. How might the robot express this fact in 
> RDF/XML? The following is my first attempt at such an expression, but the 
> resulting graph (attached) doesn't seem to visually express what I really 
> want:
> 
> <rdf:RDF
>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>   xmlns:foaf="http://xmlns.com/foaf/0.1/"
>   xmlns:dc="http://purl.org/dc/elements/1.1/">
>
>   <rdf:Description
>     rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
>     <foaf:gender>male</foaf:gender>
>   </rdf:Description>
>
>   <rdf:Description
>     rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
>     <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
>   </rdf:Description>
>
> </rdf:RDF>
> 
> Am I doing something wrong? How might you encode the following expression — 
> The Declaration Of Independence was authored by Thomas Jefferson, and Thomas 
> Jefferson was a male; therefore, the Declaration Of Independence was authored 
> by a male named Thomas Jefferson? Maybe RDF cannot express this fact because 
> it requires two predicates in a single expression, and thus the expression 
> would not be a triple but rather a “quadruple” — object, predicate #1, 
> subject/object, predicate #2, and subject?
> 
> 
> —
> Eric Morgan
> 
> 
> 


[CODE4LIB] rdf serialization

2013-11-02 Thread Eric Lease Morgan

How can I write an RDF serialization enabling me to express the fact that the 
United States Declaration Of Independence was written by Thomas Jefferson and 
Thomas Jefferson was a male? (And thus asserting that the Declaration of 
Independence was written by a male.)

Suppose I have the following assertion:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">

  <rdf:Description
    rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
  </rdf:Description>

</rdf:RDF>

Suppose I have a second assertion:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">

  <rdf:Description
    rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
    <foaf:gender>male</foaf:gender>
  </rdf:Description>

</rdf:RDF>

Now suppose a cool Linked Data robot came along and harvested my RDF/XML. 
Moreover, let's assume the robot could make the logical conclusion that the 
Declaration was written by a male. How might the robot express this fact in 
RDF/XML? The following is my first attempt at such an expression, but the 
resulting graph (attached) doesn't seem to visually express what I really want:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">

  <rdf:Description
    rdf:about="http://www.worldcat.org/identities/lccn-n79-89957">
    <foaf:gender>male</foaf:gender>
  </rdf:Description>

  <rdf:Description
    rdf:about="http://www.archives.gov/exhibits/charters/declaration_transcript.html">
    <dc:creator>http://www.worldcat.org/identities/lccn-n79-89957</dc:creator>
  </rdf:Description>

</rdf:RDF>

Am I doing something wrong? How might you encode the following expression — 
The Declaration Of Independence was authored by Thomas Jefferson, and Thomas 
Jefferson was a male; therefore, the Declaration Of Independence was authored 
by a male named Thomas Jefferson? Maybe RDF cannot express this fact because 
it requires two predicates in a single expression, and thus the expression 
would not be a triple but rather a “quadruple” — object, predicate #1, 
subject/object, predicate #2, and subject?


—
Eric Morgan


