Re: GLD CubeValidator

2016-07-26 Thread Dave Reynolds

Hi Jürgen,

On 26/07/16 10:06, Jürgen Jakobitsch wrote:

hi,

anyone feeling responsible for the DataCubeValidator [1]. it's not
available anymore.


If I recall correctly that was a W3C-supplied (virtual) machine intended 
just to support the last call process with no promise of continued 
service beyond that.


I still have the (ancient) code behind it so if W3C wanted to resurrect 
it then that would be possible from my point of view but with no working 
group there's no one to own such a thing.



note: i was writing to gld-comments two years back about another typo,
but didn't get an answer.


There have been some errata reported on the list for org but not for qb 
that I can see. Ah, looking at [2] I see it says "This mailing list is 
no longer active" so it's no longer allowing mail.


Surprising. I would have thought the official comments list for 
recommendations should stay open specifically for errata but I've been 
out of W3C processes for too long to appreciate the nuances.


Dave

[2] https://lists.w3.org/Archives/Public/public-gld-comments/



krj

[1] http://www.w3.org/2011/gld/validator/qb


*Jürgen Jakobitsch*
Innovation Director
Semantic Web Company GmbH
EU: +43-1-4021235 -0
Mobile: +43-676-6212710
http://www.semantic-web.at 
http://www.poolparty.biz 


PERSONAL INFORMATION
| web   : http://www.turnguard.com
| foaf  : http://www.turnguard.com/turnguard
| g+: https://plus.google.com/111233759991616358206/posts
| skype : jakobitsch-punkt
| xmlns:tg  = "http://www.turnguard.com/turnguard#"




Re: ORG implementations

2013-11-05 Thread Dave Reynolds

Hi Enrico,

Thanks very much for the report, which I've recorded on our tracking page [1].

It's good to see use of the n-ary version of membership.

Dave

[1] http://www.w3.org/2011/gld/wiki/ORG_Implementations


On 05/11/13 09:31, Enrico Daga wrote:

Hello Dave,
the linked data of The Open University [1] is using the Org vocabulary,
in particular the following terms:
* Organization
* Membership
* hasMembership
* memberOf
* organization
As you can see, we make use of both the binary membership and the
n-ary pattern.
This is the first time I have seen the two coexist in the same vocabulary,
and I personally believe this is a nice and flexible solution, which I
hope you won't change by removing one or the other.

Many thanks to the WG for the nice contribution!

Cheers,
Enrico


[1] http://data.open.ac.uk


On 4 November 2013 15:21, Dave Reynolds dave.e.reyno...@gmail.com wrote:

The W3C GLD working group is discussing the status of the
Organization Ontology [1] which is currently a Candidate Recommendation.

While there is evidence of usage [2] we cannot yet meet the CR exit
criteria for all the terms in the ontology. The group will need to
decide within the next two weeks what to do about this (options
include alternative publication tracks).

If you have, or are planning, a use of the ORG ontology and have not
yet reported it to the working group then please could you do so
over the next two weeks letting us know at least which terms you
have used. If you plan a report but cannot make that timescale then
please contact us.

Dave

[1] http://www.w3.org/TR/vocab-org/
[2] http://www.w3.org/2011/gld/wiki/ORG_Implementations




--
--
Enrico Daga
Project Officer - Linked Data
KMi - Knowledge Media Institute
The Open University
email: enrico.d...@open.ac.uk
skype: enri-pan
-- The Open University is incorporated by Royal Charter (RC 000391), an
exempt charity in England & Wales and a charity registered in Scotland
(SC 038302).





Re: Data Cube implementations

2013-11-04 Thread Dave Reynolds
It would be particularly helpful to hear from anyone who is either using
or planning to use the following parts of the vocabulary:


1. Cubes using a measure dimension
   - qb:measureDimension, qb:measureType

2. Non-SKOS hierarchies (e.g. geographic containment)
   - qb:HierarchicalCodeList, qb:parentChildProperty, qb:hierarchyRoot
     (a minimal sketch of this pattern is given below)
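
For reference, a minimal sketch of pattern 2 (ex: is a placeholder
namespace; the qb: terms are from the Data Cube vocabulary):

   @prefix qb: <http://purl.org/linked-data/cube#> .
   @prefix ex: <http://example.org/ns#> .

   # Geographic areas linked by an ordinary containment property
   # rather than by SKOS broader/narrower.
   ex:geoHierarchy a qb:HierarchicalCodeList ;
       qb:hierarchyRoot ex:unitedKingdom ;
       qb:parentChildProperty ex:contains .

   ex:unitedKingdom ex:contains ex:england .
   ex:england ex:contains ex:hampshire .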

Dave

On 31/10/13 16:08, Dave Reynolds wrote:

If anyone is using the RDF Data Cube Vocabulary [1] and hasn't yet
submitted an implementation report then please could I encourage you to
do so.

See [2] for details on how to do this or contact me.

Dave

[1] http://www.w3.org/TR/vocab-data-cube/

[2]
http://www.w3.org/2011/gld/wiki/How_to_submit_a_Data_Cube_Implementation_Report






ORG implementations

2013-11-04 Thread Dave Reynolds
The W3C GLD working group is discussing the status of the Organization 
Ontology [1] which is currently a Candidate Recommendation.


While there is evidence of usage [2] we cannot yet meet the CR exit 
criteria for all the terms in the ontology. The group will need to 
decide within the next two weeks what to do about this (options include 
alternative publication tracks).


If you have, or are planning, a use of the ORG ontology and have not yet 
reported it to the working group then please could you do so over the 
next two weeks letting us know at least which terms you have used. If 
you plan a report but cannot make that timescale then please contact us.


Dave

[1] http://www.w3.org/TR/vocab-org/
[2] http://www.w3.org/2011/gld/wiki/ORG_Implementations



Data Cube implementations

2013-10-31 Thread Dave Reynolds
If anyone is using the RDF Data Cube Vocabulary [1] and hasn't yet 
submitted an implementation report then please could I encourage you to 
do so.


See [2] for details on how to do this or contact me.

Dave

[1] http://www.w3.org/TR/vocab-data-cube/

[2] 
http://www.w3.org/2011/gld/wiki/How_to_submit_a_Data_Cube_Implementation_Report




Re: SPARQL results in RDF

2013-09-25 Thread Dave Reynolds

Hi Damian,

On 25/09/13 14:16, Damian Steer wrote:

On 25/09/13 12:03, Stuart Williams wrote:

On 25/09/2013 11:26, Hugh Glaser wrote:

You'll get me using CONSTRUCT soon :-)
(By the way, Tim's actual CONSTRUCT WHERE query isn't allowed because
of the FILTER).


Good catch... yes - I've been bitten by that kind of thing too... that
not all that's admissible in a WHERE 'body', is admissible in a
CONSTRUCT 'body'.


As far as I'm aware it is -- Tim's original simply misplaced a curly
brace. The filter ought to be in the WHERE body.

CONSTRUCT is essentially SELECT with a tabular-data-to-RDF stage bolted
on at the end of the pipeline.


I think the point people were making is that the syntactic shortform 
CONSTRUCT WHERE with implicit template only applies when you have a 
simple basic graph pattern [1].


If the WHERE clause is more complex, e.g. with a FILTER, then you need 
an explicit construct template.
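
For illustration (ex: is a placeholder prefix): the first query below
is a syntax error because the short form cannot contain a FILTER; the
second says the same thing with an explicit template:

   PREFIX ex: <http://example.org/ns#>

   # Not allowed: short form with a FILTER
   CONSTRUCT WHERE { ?s ex:price ?p . FILTER(?p > 10) }

   # Allowed: explicit template, FILTER in the WHERE clause
   CONSTRUCT { ?s ex:price ?p }
   WHERE { ?s ex:price ?p . FILTER(?p > 10) }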


Dave

[1] http://www.w3.org/TR/sparql11-query/#constructWhere




Re: SPARQL results in RDF

2013-09-25 Thread Dave Reynolds

On 25/09/13 14:57, Sven R. Kunze wrote:

On 09/25/2013 03:53 PM, Dave Reynolds wrote:

Hi Damian,

On 25/09/13 14:16, Damian Steer wrote:

On 25/09/13 12:03, Stuart Williams wrote:

On 25/09/2013 11:26, Hugh Glaser wrote:

You'll get me using CONSTRUCT soon :-)
(By the way, Tim's actual CONSTRUCT WHERE query isn't allowed because
of the FILTER).


Good catch... yes - I've been bitten by that kind of thing too... that
not all that's admissible in a WHERE 'body', is admissible in a
CONSTRUCT 'body'.


As far as I'm aware it is -- Tim's original simply misplaced a curly
brace. The filter ought to be in the WHERE body.

CONSTRUCT is essentially SELECT with a tabular-data-to-RDF stage bolted
on at the end of the pipeline.


I think the point people were making is that the syntactic shortform
CONSTRUCT WHERE with implicit template only applies when you have a
simple basic graph pattern [1].

If the WHERE clause is more complex, e.g. with a FILTER, then you need
an explicit construct template.

Dave

[1] http://www.w3.org/TR/sparql11-query/#constructWhere




How did you come to that conclusion?


Based on the part of the specification given by link [1] above, which 
says (my emphasis):


A short form for the CONSTRUCT query form is provided for the case 
where the template and the pattern are the same and the pattern is just 
a basic graph pattern **(no FILTERs and no complex graph patterns are 
allowed in the short form)**.


Dave




Re: Opinions sought: characterisation of relationships in terms of roles and events

2013-09-05 Thread Dave Reynolds
You may find the Membership n-ary relation in the Organization ontology
[1] a useful starting point. That currently represents person,
organization and role bindings over a time interval. There's no reason you
couldn't annotate that with location information as well.
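
For illustration, a minimal sketch of that pattern (ex: is a
placeholder namespace; time: is OWL-Time):

   @prefix org:  <http://www.w3.org/ns/org#> .
   @prefix time: <http://www.w3.org/2006/time#> .
   @prefix ex:   <http://example.org/ns#> .

   # The n-ary Membership resource binds a person, an organization
   # and a role, and can carry context such as a time interval.
   ex:membership1 a org:Membership ;
       org:member       ex:personA ;
       org:organization ex:organisationA ;
       org:role         ex:chiefExecutive ;
       org:memberDuring [ a time:Interval ] .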


Dave

[1] http://www.w3.org/TR/vocab-org/

On 04/09/13 15:17, Tanya Gray wrote:

Hello,

I am seeking feedback on a proposal for how to describe the context of a
relationship that exists between two entities, e.g. a person and an
organisation. Any thoughts on this proposal would be very welcome.

Thank you

Tanya

BACKGROUND:

Metadata exists that relates two entities and involves a role performed
by one of the entities, e.g.

<http://example.org/id/workA> <http://example.org/vocab#hasAuthor>
<http://example.org/id/personA> .

In this example the two entities are “work” and “person”, and the role
is “author”.

REQUIREMENT

There is a requirement to describe this relation in context, e.g. in
time, and space.

PROPOSAL


/Illustration of how a relationship between entities can be described in
terms of a role and an event and given context with event properties/

The proposal is:

·Define relations in terms of events and roles, and associate contextual
information with the event

·Identify events and define as classes

·Identify roles that exist for each event and define roles as
individuals and members of the class “Role”

·Define object properties for each role e.g. hasTeacher with a range of
“RoleInEvent”

·Identify the entities (besides roles) that are associated with an event
and define object properties for each event entity

·Define additional contextual information that is common to all events
as object properties, e.g. time, location, process, reason

·Define a class called “RoleInEvent” to link a role, a role player and
an event

Example RDF:

@prefix ludo: <http://vocab.ox.ac.uk/ludo#> .

<http://example.org/id/personA> ludo:hasRoleInEvent
    <http://example.org/id/RoleInEventA> .

<http://example.org/id/RoleInEventA>
    a ludo:RoleInEvent ;
    ludo:hasRole ludo:Employee ;
    ludo:hasEvent <http://example.org/id/EmploymentA> ;
    ludo:hasRolePlayer <http://example.org/id/personA> .

<http://example.org/id/organisationA> ludo:hasRoleInEvent
    <http://example.org/id/RoleInEventB> .

<http://example.org/id/RoleInEventB>
    a ludo:RoleInEvent ;
    ludo:hasRole ludo:Employer ;
    ludo:hasEvent <http://example.org/id/EmploymentA> ;
    ludo:hasRolePlayer <http://example.org/id/organisationA> .

<http://example.org/id/EmploymentA>
    # type of event
    a ludo:Employment ;
    # roles that exist for the event
    ludo:hasEmployee <http://example.org/id/RoleInEventA> ;
    ludo:hasEmployer <http://example.org/id/RoleInEventB> ;
    # contextual information addressing when, how, where, why
    ludo:hasTime <http://example.org/id/TimeA> ;
    ludo:hasProcess <http://example.org/id/ProcessA> ;
    ludo:hasLocation <http://example.org/id/LocationA> ;
    ludo:hasReason <http://example.org/id/ReasonA> .


/Illustration of how to represent a relationship between two entities in
terms of a role and an event, using an intermediate class called
“RoleInEvent”/


/Illustration of an object property (hasAuthor) defined for an authoring
event that links the event to RoleInEvent, a class that links a role, a
thing holding that role, and an event/






Re: Civic apps and Linked Data

2013-06-26 Thread Dave Reynolds
While one could debate the scale of impact I do see some successful 
civic apps here in the UK.


For example, the Environment Agency's (EA) publication of real-time
data on water quality at bathing sites [1] has been quite successful.


As well as the nice explorer application provided with the data, 
sponsored directly by EA, a third party company came along and built a 
smartphone app for viewing the data, integrated with other information 
about the bathing sites.


We are finding that increasing numbers of local authorities are 
including the data via simple web widgets on their visitor web sites. 
Some have information screens at the beaches where visitors can see 
information on the water quality without having to have a smartphone.


I think the key here was:

1. Having data which is actually of interest in a civic context (how 
safe is it to go swimming?).


2. The publisher providing a compelling example of what can be done with 
the data.


3. Publisher commitment to keeping the data up to date.

4. Having a simple API so that web developers could get initial access 
without needing to learn SPARQL or RDF. The Linked Data API [2] means 
that you can filter the data to get the part you want and have it 
delivered in easy to consume JSON or XML.


Arup, who developed the smartphone app, initially just used the API,
never having worked with RDF. From that they gained an understanding of
what this Linked Data stuff is all about, and from there started to look
at what other data sets could be linked into the bathing site reference
data.


I feel that repeating patterns like this is the way to go. Build up
value steadily with more and more data sets and more and more locally
useful applications. None of them need be revolutionary; it is the
incrementally growing value of the data ecosystem that matters, and that
is a long game to play. Don't hold out for "we changed the world" killer
apps, just deliver value.


Dave

[1] http://environment.data.gov.uk/
[2] https://code.google.com/p/linked-data-api/

On 26/06/13 15:03, Alvaro Graves wrote:

Hello everyone,

A few days ago I attended ABRELATAM'13, an unconference focused on Open
Data in Latin America. I proposed a session about Open Data + Linked
Data to discuss how semantics and LOD in general can help government and
civic organizations. I want to share the main ideas that emerged from the
conversation:

- SW/LOD sounds really cool and is the direction where things should move.
- However, there are many technical aspects that remain unsolved.
- Since for many people having a relatively good solution using CSV,
JSON, etc. is easier, they don't want to use SW/LOD because it is
overkill and too complicated.

So my question is: Why don't we see lots of civic apps using Linked
Data? Where are the SW activists? Why haven't we been able to
demonstrate to the hacker community the benefits of using semantic
technologies? Is it because they are hard to use? Because they don't
scale well in many cases (as a Google search points out)? Are we too
busy working in academia/businesses?

I know very few civic apps using semantic technologies and I don't think
I have seen any that have made real impact in any country. I would
love it if you can prove me wrong and if we can discuss how we can
involve more activists and hackers in the SW/LOD community.


Alvaro Graves-Fuenzalida
Web: http://graves.cl - Twitter: @alvarograves





Re: The Great Public Linked Data Use Case Register for Non-Technical End User Applications

2013-06-24 Thread Dave Reynolds

On 24/06/13 13:44, Kingsley Idehen wrote:


As you've indicated, there have been many attempts at this over the
years and they never take-off or meet their goals etc.. The problem is
that a different approach is required. Basically, in this scenario lies
a simple Linked Data publication usecase i.e., a problem that Linked
Data addresses.

The steps:

1. use a Linked Data document to describe your product, service,
platform, usecase
2. publish the document
3. make people aware of the document.

Crawlers will find your document. The content of the document will show
up in search results.


There is, of course, the W3C community directory [1] which works exactly
that way. It is organized around projects rather than use cases, and
might need some extensions to support the fields that Dominic was
suggesting. But it does provide a form-based way to generate the initial
RDF for you to publish, does the crawling, and provides a UI over the
crawled data.



The trouble is ...


[complaints snipped]


People need to understand that scribbling is a natural Web pattern
i.e., rough cuts are okay since improvements will be continuous.


Reusing patterns does make it easier for tools to aggregate and present 
data. The perfect might be the enemy of the good, but sometimes a little 
effort to do things consistently is good.


Dave

[1] http://dir.w3.org/



Re: Percentages in Linked Data

2013-06-24 Thread Dave Reynolds

Hi Frans,

On 24/06/13 17:37, Frans Knibbe | Geodan wrote:

Hello,

I would like to publish some statistical data. A few of these numbers
are percentages. What is the best way to make it clear to data consumers
that the numbers are to be treated as percentages? As far as I can tell,
the XSD data types do not suffice.


QUDT does include percent as a unit:  http://qudt.org/vocab/unit#Percent

Assuming you are using the RDF Data Cube then you can use this as the
value of the unit-of-measure attribute (sdmx-attribute:unitMeasure).


If you are dealing with single measures or measure-dimension cubes that 
should be fine.


If you are using multi-measure observations then life gets harder. In
that case you might consider encoding your values as e.g.
qudt:QuantityValues and attaching the unit of measure that way.
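
For illustration, a minimal sketch of the single-measure case (ex: is
a placeholder namespace):

   @prefix qb:             <http://purl.org/linked-data/cube#> .
   @prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
   @prefix unit:           <http://qudt.org/vocab/unit#> .
   @prefix ex:             <http://example.org/ns#> .

   ex:obs1 a qb:Observation ;
       qb:dataSet ex:dataset1 ;
       ex:refArea ex:regionA ;                    # dimension (placeholder)
       ex:unemploymentRate 7.5 ;                  # the percentage value
       sdmx-attribute:unitMeasure unit:Percent .  # unit of measure attribute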


Dave





Re: Help with modeling my ontology

2013-02-28 Thread Dave Reynolds
Just on the question of representing measurements, one approach is the
RDF Data Cube vocabulary [1]. In that, each observation has a measure
(the thing you are measuring, such as canopyHeight), dimensions saying
where/when/etc. the measurement applies, and attributes that allow you
to interpret the measurement.


So you would normally make the unit of measure an attribute.

If the method doesn't fundamentally change the nature of the thing you 
are measuring then you could make that another attribute.  If it does 
then you should have a different measure property for the different 
methods (possibly with some common super property).
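
For illustration, a minimal sketch with unit and method as attributes
(ex: is a placeholder namespace; all the property and code names are
invented for the example):

   @prefix qb: <http://purl.org/linked-data/cube#> .
   @prefix ex: <http://example.org/ns#> .

   ex:canopyHeight  a qb:MeasureProperty .
   ex:unitOfMeasure a qb:AttributeProperty .
   ex:method        a qb:AttributeProperty .

   ex:obs1 a qb:Observation ;
       qb:dataSet ex:trialResults ;
       ex:plant ex:groundnut1 ;                 # dimension
       ex:canopyHeight 9.5 ;                    # the measured value
       ex:unitOfMeasure ex:centimetre ;         # attribute: unit
       ex:method ex:baseToTipOfMainStem .       # attribute: methodology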


Dave

[1] http://www.w3.org/TR/vocab-data-cube/

On 27/02/13 20:58, Luca Matteis wrote:

Hello all,

At http://www.cropontology.org/ I'm trying to make things a little more
RDF friendly. For example, we have an ontology about Groundnut here:
http://www.cropontology.org/ontology/CO_337/Groundnut/ttl

I'm generating this from a somewhat flat list of names/concepts, so it's
still a work in progress. But I'm having issues making sense of it all
so that the ontology can be used by people that actually have Groundnut
data.

For example, in that Turtle dump, search for "Canopy height". This is a
concept that people might use to describe the height of the canopy of
their groundnut plant, as the comment describes (this should be a
Property not a Class, but like I said, it's still work-in-progress).
Let's try with some sample data someone might have about groundnut, and
see if I can further explain my issue (I assume co: is a prefix for my
cropontology.org http://cropontology.org site, also the URIs are
different but it's just an example):

 :groundnut1
   a co:Groundnut;
   co:canopyHeight xxx .

Ok here's the issue: we know that `canopyHeight` is measured using
different methodologies. For example it might be measured using a
methodology that we found to be described as "Measuring the distance
from the base to the tip of the main stem", but it might also be some
other method. And, funny enough, we also realized that it is measured
using centimeters, with a minimum of 0 and a maximum of 10cm.

So how should I make this easier on the people that are using my
ontology? Should it be:

 :groundnut1
   a co:Groundnut;
   co:canopyHeight "9.5cm" .

or should it be:

 :groundnut1
   a co:Groundnut;
   co:canopyHeight [
 co:method "Measuring the distance from the base to the tip of
the main stem" ;
 co:scale "9.5cm"
   ] .

Maybe I'm going about this the wrong way and should think more about how
this ontology is going to be used by people that have data about it...
but I'm not sure. Any advice would be great. And here's the actual
browsable list of concepts, in a tree sort of interface:
http://www.cropontology.org/terms/CO_337:039/

As you can see there's this kind of thing happening all over the
ontology, where we have the Property, the method it was measured by, and
finally the scale. Any help? Thanks!






Re: referencing a concept scheme as the code list of some referrer's property

2012-08-29 Thread Dave Reynolds

On 28/08/12 20:39, Antoine Isaac wrote:

Sorry, my owl:someValuesFrom should have been owl:allValuesFrom, I guess.


Actually I think owl:someValuesFrom is right, though the easiest
construct is owl:hasValue:

some:codeAConcept owl:equivalentClass [
    owl:intersectionOf ( skos:Concept
        [ rdf:type owl:Restriction ;
          owl:onProperty skos:inScheme ;
          owl:hasValue some:codeAConceptScheme ]
    )
] .

I agree with the rest of your comments.

Dave


Antoine



Hi Thomas, all,

I disagree with you on the fact that sub-classing skos:Concept would
solve all representation needs. There is data to be asserted at the
concept-scheme level which would be inappropriately captured when
attached directly to a class of concepts, e.g. the creator of a
concept scheme or the rights attached to it. These will sit very badly
on a class in an OWL ontology that represents the concept scheme. At
least (OWL) class and annotation properties are usually created.

Also, concepts are not instances of a concept scheme. It is really
stretching the way concepts and vocabularies are viewed in the domain,
as already mentioned by others. Plus, doing this would also break the
fundamentally good pattern followed by SKOS: SKOS data can remain
entirely at the instance level (in OWL terms). When one ports a
thesaurus onto the Semantic Web, one doesn't want to be forced to use
RDFS/OWL features in the published data.

Of course SKOS resources can be used to create OWL ontologies. But I
think it's better that this remains the business of an ontology
creator (the guy creating the property that admit values from a given
concept scheme) and not the business of a KOS publisher.

So yes, I really dislike using some:codeA and some:codeB as Simon does
in [1]. I mean, having them is not bad, but having no concept scheme is
what bothers me.
I would really prefer the approach that consists of (OWL) defining
some:codeAConcept
as
some:codeAConcept owl:equivalentClass [
    owl:intersectionOf ( skos:Concept
        [ rdf:type owl:Restriction ;
          owl:onProperty skos:inScheme ;
          owl:someValuesFrom [ owl:oneOf ( some:codeAConceptScheme ) ] ]
    )
] .

and then you use some:codeAConcept as the range of your my:property1
and keep some:codeAScheme carrying its scheme-level data.

Or in fact, if you hate redundancy, you can create your new property
straight away as:

my:property1 rdfs:range [
    owl:intersectionOf ( skos:Concept
        [ rdf:type owl:Restriction ;
          owl:onProperty skos:inScheme ;
          owl:someValuesFrom [ owl:oneOf ( some:codeAConceptScheme ) ] ]
    )
] .


I understand you may want a construct to directly relate a property to
a concept scheme to constrain its values. But then it is really about
adding some new (meta-modelling) feature which was not identified in
the SKOS requirements. And as said in my previous mail, for now I'd
rather leave it to initiatives (like DC) which address data modelling
at a deeper level.
And in fact such feature would still be a shorthand for something that
is possible using OWL out-of-the-box.

Best,

Antoine

PS: by the way, I also disagree on merging (license, or more generally
any kind of provenance) data on the (voiD) dataset with data on a
ConceptScheme, as hinted in [2]. In the past a project I've worked for
created a version of the RAMEAU vocabulary [3], based on a snapshot
of the official version [4]. I think it was a good thing that people
could tell the difference between the real vocabulary and the
prototype dataset we had created. In other terms, keeping the thing
that is represented distinct from the dataset that represents it --
another good pattern to follow, I think!

[1] http://lists.w3.org/Archives/Public/public-lod/2012Aug/0060.html
[2] http://lists.w3.org/Archives/Public/public-lod/2012Aug/0063.html
[3] http://stitch.cs.vu.nl/rameau
[4] http://rameau.bnf.fr



Hi Antoine and CCs and everybody,

nice answer, and I'm glad you have detected my question in this
haystack.
I think I have to tell more about the context of this question.

We have a new RD project about Linking Open Environment Data [1].
Here we try to bring together Data Cubes (prefix qb:) [2], SKOS, DCAT,
VoID, etc.

In Data Cubes, dimension properties are defined as having rdfs:range
skos:Concept, plus a qb:codeList whose rdfs:range is skos:ConceptScheme.
Fine so far.

We have also developed iQvoc [3] in the previous years, following the
pattern described by SKOS in [4]: "The notion of an individual SKOS
concept scheme corresponds roughly to the notion of an individual
thesaurus." The technical consequence has been (so far) serving a single
concept scheme per iQvoc instance; we may link multiple concept
schemes by SKOS mapping properties between such instances.
Fine as well so far.

Now we have some quite large concept schemes, and a single dimension
property cannot refer to the entire concept scheme (= thesaurus) as its
value set, but only to a subset. So we have 

Re: referencing a concept scheme as the code list of some referrer's property

2012-08-23 Thread Dave Reynolds

[Apologies for continuing the cross-posting]

A pattern of using sub-classes of skos:Concept to denote a group of 
concepts (and thus be able to use rdfs:range in associated ontologies) 
is a good one. It is recommended best practice in data.gov.uk linked 
data work, for example.
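
For illustration, a minimal sketch of that pattern (ex: is a
placeholder namespace):

   @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
   @prefix ex:   <http://example.org/ns#> .

   # A subclass of skos:Concept denoting one group of concepts ...
   ex:SexCode rdfs:subClassOf skos:Concept .

   # ... which an associated ontology can then use as an rdfs:range.
   ex:sex a rdf:Property ;
       rdfs:range ex:SexCode .

   ex:male a ex:SexCode ;
       skos:prefLabel "male"@en .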


This does not remove the value of having an explicit representation of a 
concept scheme available if you wish to use it. It gives a place to 
attach scheme level metadata such as license information. While you 
*can* attach such information to an owl:Class you end up having to use 
owl:AnnotationProperties which causes its own problems (though less so 
in OWL 2). Having an explicit concept scheme also signals intention.


So while you certainly could get away with a leaner skos with no 
skos:ConceptScheme I think the spec is stronger for having it and, like 
with any spec, you adopt best practice patterns to suit your 
circumstances and preferences.



skos:ConceptScheme prioritises domain conventions over common and shared
(better: to-be-shared) RDFS/OWL patterns.


Disagree. The notion of an explicit collection is a common RDFS/OWL 
pattern and indeed the Linked Data Platform work seems to be partly 
about strengthening that.



I found something similar in the Data Structure Definition of Data
Cubes. Dave (cc) will understand, as we had some discussion about this
topic ;-)


Not wishing to repeat that discussion, let me just summarize that Data
Structure Definitions have clear value that has been demonstrated in
practice, separate from any desire to be compatible with SDMX. The fact
that it is compatible with the way people in that domain think about the
problem doesn't of itself make it a bad thing :)



Dave et al. is conciliatory with SDMX and weakens RDFS/OWL by this.


Disagree. Borrowing a modelling pattern from some domain doesn't weaken 
RDFS/OWL.


Dave




Re: referencing a concept scheme as the code list of some referrer's property

2012-08-23 Thread Dave Reynolds

On 23/08/12 10:22, Thomas Bandholtz wrote:

Am 23.08.2012 10:40, schrieb Dave Reynolds:

[Apologies for continuing the cross-posting]

A pattern of using sub-classes of skos:Concept to denote a group of
concepts (and thus be able to use rdfs:range in associated ontologies)
is a good one. It is recommended best practice in data.gov.uk linked
data work, for example.

Fine. Why then does Data Cube not make use of this?


??

Data Cube certainly *permits* you to use subsets of skos:Concept and to 
declare the rdfs:range of the Dimension and Measure properties, we've 
used this for publishing many environmental data Cubes.


What it doesn't do is *require* you to use this style and only this
style. It allows you to directly reference the ConceptScheme, in keeping
with SKOS as it is currently used.



Having an explicit concept scheme also signals intention.


My notion is that defining subclasses of skos:Concept expresses that you
want to create a concept scheme.


Not really, it's not an out-of-the-box way of thinking about RDFS/OWL
but a pattern you could impose on top. In OWL a class is defined by its
extension (set of values), whereas a ConceptScheme (or whatever you
replace it by) has an independent existence; it's an individual which
you want to talk about (for example to say that the scheme is mandated
by a certain piece of legislation) separately from the enumeration of
its members.
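
For example, a minimal sketch of scheme-level statements that are
about the scheme as an individual rather than about its extension
(ex: is a placeholder namespace; dct:source is just one way to point
at the mandating legislation):

   @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix ex:   <http://example.org/ns#> .

   ex:sexScheme a skos:ConceptScheme ;
       dct:license <http://example.org/licences/open-licence> ;
       dct:source  <http://example.org/legislation/statistics-act> .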



Tons of metadata properties (such as skos:pref/altLabel or note) do not
define any explicit domain, so the domain is rdfs:Resource. Why not
attach them to an rdfs:Class or owl:Class, which are subclasses of
rdfs:Resource?


Well, the relevant properties here are things like skos:topConcept which
do have domain axioms, but if you were changing the specs you could
certainly remove those.


Assuming domain compatibility, then in RDFS and OWL Full you are indeed
free to do this sort of annotation.


In OWL DL there are issues precisely because this is treating a class as 
if it were an individual. OWL 2 eases this with punning (ugh) and 
through the ability to at least declare the range of an 
owl:AnnotationProperty. Whether that easement is enough depends on the 
specifics of your situation, including your tool chain.


You might not personally care about OWL DL but some users of SKOS 
(including Simon, I believe) do have to.


Dave




Re: Linked Data Demand Discussion Culture on this List, WAS: Introducing Semgel, a semantic database app for gathering analyzing data from websites

2012-07-20 Thread Dave Reynolds

Hi Sebastian,

I completely agree with what you say about:
  o Harish's original post being relevant to linked data and this list
  o that the culture of this forum can be counter productive
  o that the evidence for linked data delivering business value needs
to be a lot stronger

However, just to balance the picture slightly ...

There are *some* clear, well documented examples of semweb/RDF/LD
delivering business value through data integration. The most famous of
these are probably Garlik (now Experian), Amdocs and, arguably, the
BBC. In my experience, for every publicised example there are several
non-public or at least less visible examples of companies quietly using
the technology internally while not shouting about it. I've come across
examples in banking, publishing, travel and health care - at different
levels of maturity.


Not saying the business value story is perfectly articulated or the 
evidence is watertight, but it's not totally absent :)


While it's not your main point, I would also say we have reasonable
arguments for the value of linked data over just CSVs for publishing
government statistics and measurement data. The benefits include safer
use of data because it's self-describing (e.g. units!), the ability to
slice and dice through API calls making it easier to build apps, and the
ability to address the data and thus annotate and reference it. The more
advanced government departments approach this as "publish once, use
many": one pipeline that lets people access the data as dumps, through
REST APIs, as Linked Data or via apps - all powered by a shared Linked
Data infrastructure. It's not CSV or Linked Data, it's CSV *and* Linked
Data.


Dave

On 20/07/12 16:48, Sebastian Schaffert wrote:

Kingsley,

I am trying to respond to your factual arguments inline. But let me first point out that the 
central problem for me is exactly what Mike pointed out: In your enthusiasm and cheerleading 
you as often turn people off as inspire them. You too frequently take it upon yourself to 
speak for the community. Semgel is a nice contribution being contributed by a new, 
enthusiastic contributor. I think this is to be applauded, not lectured or scolded. Semgel is 
certainly as much on topic as most of the posts to this forum.

The message you should hear is that many people are frustrated by the way the 
discussions in this forum are carried out and have already stopped contributing 
or even reading. And this is a very bad development for a community. The topic 
we are discussing right now is only a symptom. Please think about it.

Am 20.07.2012 um 16:43 schrieb Kingsley Idehen:


On 7/20/12 4:06 AM, Sebastian Schaffert wrote:

Am 19.07.2012 um 20:50 schrieb Kingsley Idehen:


I completely understand and appreciate your desire (which I share) to see a 
mature landscape with a range of linked data sources. I can also understand how 
a database or spreadsheet can potentially offer fine-grained data access - your 
examples do illustrate the point very well indeed!

However, if we want to build a sustainable business, the decision to build 
these features needs to be demand driven.

I disagree.
Note, I responded because I assumed this was a new Linked Data service. But it 
clearly isn't. Thus, I don't want to open up a debate about Linked Data virtues 
if you incorrectly assume they should be *demand driven*.

Remember, this is the Linked Open Data (LOD) forum. We've long past the issue 
of *demand driven* over here, re. Linked Data.

But I agree. A technology that is not able to fire proof its usefulness in a 
demand driven / problem driven environment is maybe interesting from an 
academic standpoint but otherwise not really useful.


So are you claiming that Linked Data hasn't fire proofed its usefulness in a 
demand drive / problem driven environment?



Indeed. This is my right as much as yours is to claim the opposite.

My claim is founded in the many discussions I have when going to the CTOs of 
*real* companies (big ones, outside the research business) out there and trying 
to convince them that they should build on Semantic Web technologies (because I 
believe they are superior). Believe me, even though I strongly believe in the 
technology, this is a very tough job without a good reference example that 
convinces them they will save X millions of Euros or improve the life of their
employees or the society in the short- to medium term.

Random sample answer from this week (I could bring many): "So this
Linked Data is a possibility for data integration. Tell me, why should I
convince my engineers to throw away their proven integration solutions?
Why is Linked Data so superior to existing solutions? Where is it
already in enterprise use?"

The big datasets always sold as a success story in the Linked Data Cloud are 
largely irrelevant to businesses:
- they are mostly dealing with internal data (projects, people, CRM, ERP, 
documents, CMS, …) where you won't find information in the 

Re: Datatypes with no (cool) URI

2012-04-03 Thread Dave Reynolds

On 03/04/12 16:38, Sarven Capadisli wrote:

On 12-04-03 02:33 PM, Phil Archer wrote:

I'm hoping for a bit of advice and rather than talk in the usual generic
terms I'll use the actual example I'm working on.

I want to define the best way to record a person's sex (this is related
to the W3C GLD WG's forthcoming spec on describing a Person [1]). To
encourage interoperability, we want people to use a controlled
vocabulary and there are several that cover this topic.

ISO 5218 has:
0 = not known;
1 = male;
2 = female;
9 = not applicable.

and Eurostat offers
F = female
M = male
OTH = other
UNK = unknown
NAP = not applicable

IMO, the spec should not dictate which one to use (there are others too
of course). What I *do* want to do though is to encourage publishers to
state which vocabulary they're using. Sounds like a job for a datatype -
and for that you need a URI for the vocabulary. Something like:

schema:gender "1"^^<http://iso.org/5218/> .

Except I made that iso.org URI up. The actual URI for it is
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266

(or rather, that's the page about the spec but that's a side issue for
now).

That URI is just horrible and certainly not a 'cool URI'. The Eurostat
one is no better.

Does the datatype URI have to resolve to anything (in theory no, but in
practice)? Would a URN be appropriate?

Given that the identifier for the ISO standard is ISO/IEC 5218:2004
how about urn:iso/iec:5218:2005?

For Eurostat, the internal identifier for the vocabulary is SCL - Sex
(standard code list) so would urn:eurostat:scl:sex be appropriate?

Anyone done anything like this in the real world?

All advice gratefully received.

Thank you

Phil.


[1] https://dvcs.w3.org/hg/gld/raw-file/default/people/index.html



Perhaps I'm looking at your problem the wrong way, but have you looked
at the SDMX Concepts:

http://purl.org/linked-data/sdmx/2009/code#sex

-Sarven



I was going to suggest that :)

Actually looking at that I see that I've failed to datatype the 
skos:notation entries in those code lists. There should probably be a 
http://purl.org/linked-data/sdmx/2009/code#sexDT datatype to go with the 
notation on those skos:Concepts.
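
For illustration, a sketch of what that would look like (sexDT is the
suggested datatype above, not yet a published term):

   @prefix skos:      <http://www.w3.org/2004/02/skos/core#> .
   @prefix sdmx-code: <http://purl.org/linked-data/sdmx/2009/code#> .

   sdmx-code:sex-M a skos:Concept ;
       skos:notation "M"^^sdmx-code:sexDT ;   # datatyped notation
       skos:prefLabel "Male"@en .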


Phil, if that's important to you then raise it as an issue on the 
tracker [1] and, if no one objects, then I can get it fixed.


Dave

[1] http://code.google.com/p/publishing-statistical-data/issues/list




Re: Thought: 207 Description Follows

2012-03-28 Thread Dave Reynolds

On 28/03/12 14:50, Kjetil Kjernsmo wrote:

On Wednesday 28. March 2012 14.37.42 Jeni Tennison wrote:

I don't think it's web hosters who would find it hard to deploy, rather that
people who just want to publish some data on some tiny patch of web space
that they own, often actually run by outsourced IT departments, do not
typically have access to either the software running on the servers (to
upgrade it) or to the configuration files that would enable them to change
status codes (or add headers for that matter).


Oh, we're talking about the same people! Web hosters may be companies
that offer web hosting, typically in rather constrained environments, or
IT departments, again with the type of constraints you mention. This
software isn't static; it is usually upgraded in a cycle of 3-4 years,
so in those years we can get our code in there.


I, at least, am not talking about web hosters even in those indirect guises.

The people who are (in my experience) putting time, effort and money 
into getting linked data published are generally in the line of 
business and may have little or no direct influence over the web hosters.


For example, in UK local government then even getting a static file 
published on a web site is very tricky if the file type isn't on the 
list of acceptable file types for that organization.


Another example, as Michael said (talking about a slightly different 
group of people) some publishers are not allowed to touch anything in 
the head section of their HTML.


This particular piece of the puzzle is not a technology or tools issue. 
The web hosting in those cases is perfectly capable of publishing static 
files or allowing content in the head of an html document. It is an 
organizational and social issue.


Dave


So, the key here is to understand how our software gets onto those
servers, so it is there to begin with. So that there is no strange
tweaking of config files, no user-supplied packages. How can we make
thousands of such companies advertise that they host linked data, like
they advertise that you can use MySQL, memcached, or nginx? *That's*
what we have to do.

Kjetil






Re: Thought: 207 Description Follows

2012-03-28 Thread Dave Reynolds

On 28/03/12 17:07, Kjetil Kjernsmo wrote:

Wed 28 mars 2012 16:35, Dave Reynolds wrote:

This particular piece of the puzzle is not a technology or tools issue.
The web hosting in those cases is perfectly capable of publishing static
files or allowing content in the head of an html document. It is an
organizational and social issue.


I have no doubt this is often the case, but frankly, I'm not interested in
those cases. They will always lag a decade behind, and the rest of the
world would have to provide a compelling case for why they need to change
their deeply ingrained practices.


It's not a question of lag, there are perfectly valid reasons for such 
constraints. It's a question of helping the people doing the publishing 
to be able to do so, despite such constraints.



I think the focus should be on those, not on organisational practices that
are hard to change anyway.


I didn't mean to imply we should try to change organizational
practices, no way. I meant that we need technical approaches which
enable publishers to succeed *despite* such constraints. Not having to
worry about 303s would be one useful step in that direction. It is not a
magic bullet (after all, in many such situations a hash URI is a
workable alternative).


Dave




Re: Change Proposal for HttpRange-14

2012-03-26 Thread Dave Reynolds

On 25/03/12 19:24, Kingsley Idehen wrote:


Tim,

Alternatively, why not use the existing Link: header? Then we end up
with the ability to express the same :describedby relation in three
places


Which is, of course, in the now-submitted proposal.

Dave



Re: Change Proposal for HttpRange-14

2012-03-24 Thread Dave Reynolds

On 24/03/12 13:57, Jonathan A Rees wrote:

On Sat, Mar 24, 2012 at 7:17 AM, Jeni Tennisonj...@jenitennison.com  wrote:


Where well-behaved sites will have to make a decision is whether to continue to 
use a 303 or switch to using a 200 and including a 'describedby' relationship. 
For example, we at legislation.gov.uk might be seriously tempted to switch to 
returning 200s from /id/ URIs. Currently, anyone requesting an /id/ places a 
load on our origin server because the CDN can't cache the 303 response, so we 
try to avoid using them in links on our site even where we could (and really 
should). Consequently people referring to legislation don't use the /id/ URIs 
when what they are referring to is the legislation item, not a particular 
version of it. If we switched to a 200, we wouldn't have to avoid those URIs, 
which would in turn help us embed RDFa in our pages, because instead of having 
a reference in a footnote contain something like: [...]


Sorry to be a broken record here, I must really not be hearing what
everyone is saying, but why don't you just use hash URIs? (Using the
#it or #_ indirection pattern if necessary.) This is the received
wisdom from the original semweb design, and they don't have any of the
problems that 303s or 200s do.


I can't speak for legislation.data.gov.uk.

However, as a long-time advocate of the "just use hash URIs" school,
there are a couple of problems that convince me they are not convenient
for all cases.


In the case where your data is, or could reasonably be, organized as a 
manageable bunch of documents then hash URIs are perfect. Small to 
moderately size ontologies are a canonical example of this.


In more complex situations they are less perfect.

Problem 1: fragility of default fragment assumption

In many systems we wish to serve up linked data from a triple store. A
request for information on U comes in and we return DESCRIBE U. However,
if U is a hash URI U'#F then, of course, all we see is U' and we need to
reconstruct the F. For data totally under our own control we can adopt
some standard fragment like 'it' and respond to U' with the result of
DESCRIBE U'#it. However, the data is not always under our complete
control and there is no universal agreement on what default fragment to
use. That leaves us either maintaining mapping tables or trying multiple
probes (when asked for U, try U, then U#id, then ...). Not a fatal
problem but certainly an inconvenience when managing large and
complex stores.
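
For example, a sketch of the probe a store might run, assuming the
'it' convention (hypothetical URI):

   # The HTTP request was for <http://example.com/things/thing1>;
   # assume the conventional fragment and describe that resource.
   DESCRIBE <http://example.com/things/thing1#it>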


Problem 2: serialization

With a convention of a single standard fragment, prefix notation in
Turtle and qname notation in RDF/XML become unusable: you would have to
have a separate prefix/namespace for each resource. In Turtle you can
just write out all URIs in full, inconvenient but not fatal. In RDF/XML
you can do that for subjects/objects but not for properties (and not for
classes if you want to use the abbreviated syntax). Having to declare a
new prefix for every property, and maybe every class, in a large
ontology just so it can be serialized is a non-starter.
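
A sketch of the problem (hypothetical URIs): a prefix can only
abbreviate the part of a URI before the '#', so with a single standard
fragment every resource needs its own prefix:

   @prefix thing1: <http://example.com/data/thing1#> .
   @prefix thing2: <http://example.com/data/thing2#> .

   # One prefix declaration per resource, just to be able to write
   # thing1:it and thing2:it.
   thing1:it a <http://example.com/ns#Thing> .
   thing2:it a <http://example.com/ns#Thing> .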


Dave





Re: Change Proposal for HttpRange-14

2012-03-23 Thread Dave Reynolds

On 23/03/12 14:33, Pat Hayes wrote:


On Mar 23, 2012, at 8:52 AM, Jonathan A Rees wrote:


I am a bit dismayed that nobody seems to be picking up on the point
I've been hammering on (TimBL and others have also pointed it out),
that, as shown by the Flickr and Jamendo examples, the real issue is
not an IR/NIR type distinction, but rather a distinction in the
*manner* in which a URI gets its meaning, via instantiation (of some
generic IR) on the one hand, vs. description (of *any* resource,
perhaps even an IR) on the other. The whole
information-resource-as-type issue is a total red herring, perhaps the
most destructive mistake made by the httpRange-14 resolution.


+1000. There is no need for anyone to even talk about information resources. 
The important point about http-range-14, which unfortunately it itself does not make 
clear, is that the 200-level code is a signal that the URI *denotes* whatever it 
*accesses* via the HTTP internet architecture.


Quite, and this signal is what the change proposal rejects.

The proposal is that URI X denotes what the publisher of X says it 
denotes, whether it returns 200 or not.


In those cases where you want a separate URI Xrdf to denote the 
document containing the steaming pile of RDF triples describing X then 
(in addition to use of 303s) you have the option to include


 X wdr:describedby Xrdf .

Thus if X denotes a book then you can describe the license for the book 
and the license for the description of the book separately.
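
For illustration, a sketch of that separation (hypothetical URIs;
wdr: bound to the POWDER-S namespace, dct:license used as one way to
state the licenses):

   @prefix wdr: <http://www.w3.org/2007/05/powder-s#> .
   @prefix dct: <http://purl.org/dc/terms/> .

   # X denotes the book; Xrdf denotes the RDF document describing it.
   <http://example.org/id/book1>
       wdr:describedby <http://example.org/doc/book1> ;
       dct:license <http://example.org/licences/book-licence> .

   <http://example.org/doc/book1>
       dct:license <http://creativecommons.org/licenses/by/4.0/> .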


Dave



Re: Change Proposal for HttpRange-14

2012-03-23 Thread Dave Reynolds

On 23/03/12 15:40, Kingsley Idehen wrote:

On 3/23/12 10:59 AM, Dave Reynolds wrote:

On 23/03/12 14:33, Pat Hayes wrote:


On Mar 23, 2012, at 8:52 AM, Jonathan A Rees wrote:


I am a bit dismayed that nobody seems to be picking up on the point
I've been hammering on (TimBL and others have also pointed it out),
that, as shown by the Flickr and Jamendo examples, the real issue is
not an IR/NIR type distinction, but rather a distinction in the
*manner* in which a URI gets its meaning, via instantiation (of some
generic IR) on the one hand, vs. description (of *any* resource,
perhaps even an IR) on the other. The whole
information-resource-as-type issue is a total red herring, perhaps the
most destructive mistake made by the httpRange-14 resolution.


+1000. There is no need for anyone to even talk about information
resources. The important point about http-range-14, which
unfortunately it itself does not make clear, is that the 200-level
code is a signal that the URI *denotes* whatever it *accesses* via
the HTTP internet architecture.


Quite, and this signal is what the change proposal rejects.

The proposal is that URI X denotes what the publisher of X says it
denotes, whether it returns 200 or not.

In those cases where you want a separate URI Xrdf to denote the
document containing the steaming pile of RDF triples describing X
then (in addition to use of 303s) you have the option to include

X wdr:describedby Xrdf .

Thus if X denotes a book then you can describe the license for the
book and the license for the description of the book separately.

Dave



Dave,

What developer profile is going to perform the following:

1. make the relation -- at resource creation time
2. comprehend the relation -- at resource consumption time.


They don't have to. That's the point. This is removing a tax, not adding 
one.


A developer who wants to use a URI to denote something can just publish 
RDF at that URI and they are done. They don't *have* to enter the world 
of 303's and httpRange-14.


However, *if* that developer wants to also say something about the RDF 
document they have published (e.g. provenance or licensing) *then* they 
have the option to create a second URI for that and relate the two by 
any of the three mechanisms described (303, link header, wdr:describedby 
relation).


A particular beauty of the change proposal is that it allows the 
developer to take the easy path first and then, if later they find a 
need to reference the RDF document itself, they can refactor to do that. 
The entity URIs don't change.


Lower barrier to entry, but you don't get trapped into a corner if you 
enter this way.


Dave



Re: PURLs don't matter, at least in the LOD world

2012-02-18 Thread Dave Reynolds

On 17/02/12 21:08, Kingsley Idehen wrote:

On 2/17/12 2:18 PM, David Booth wrote:

On Fri, 2012-02-17 at 18:48 +, Hugh Glaser wrote:
[ . . . ]

What happens if I have http://purl.org/dbpedia/Tokyo, which is set to
go to http://dbpedia.org/resource/Tokyo?
I have (a), (b) and (c) as before.
Now if dbpedia.org goes Phut!, we are in exactly the same situation -
(b) gets lost.

No, the idea is that the administrator for http://purl.org/dbpedia/
updates the redirect, to point to whatever new site is hosting the
dbpedia data, so the http://purl.org/dbpedia/Tokyo still works.




David,

But any admin that oversees a DNS server can do the same thing. What's
special about purl in this context?


Precisely that they don't require an admin with power over the DNS 
registration :)


To me the PURL design pattern is about delegation of authority, and it's
an important pattern.


Two specific use cases at different extremes:

(1) An individual is creating a small vocabulary that they would like to 
see used widely but don't have a nice brand-neutral stable domain of 
their own they can use for the purpose. This one has already been 
covered in the discussion.


(2) I'm a big organization, say the UK Government. I want to use a
particular domain (well, a set of subdomains) for publishing my data, say
*.data.gov.uk. The domain choice is important - it has credibility and
promises long-term stability. Yet I want to decentralize the
publication itself: I want different departments and agencies to publish
data and identifiers within the subdomains. The subdomains are supposed
to be organization-neutral yet the people doing the publication will be
based in specific organizations. The PURL design pattern (though not
necessarily the specific PURL implementation) is an excellent way to
manage the delegation that makes that possible.


So my summary answer to Hugh is - they are much more important to the 
publisher than to the consumer.


Dave



Re: How to express something is-located-at an org:Site

2011-11-10 Thread Dave Reynolds
On Thu, 2011-11-10 at 09:42 -0500, Luis Bermudez wrote: 
 Regarding location, there is another W3C activity, that is defining how to 
 express Points of Interest [1] . 

Thanks for the pointer, I wasn't aware of that.

 Maybe site is a POI. 

Don't think it is, but another near match. That page says that a POI has
a geographic location; its location might change over time but it seems
to be required to have one. It doesn't seem to encompass any notion of
virtual location.

Dave

 
 [1] http://www.w3.org/2010/POI/wiki/Main_Page





Re: How to express something is-located-at an org:Site

2011-11-08 Thread Dave Reynolds
[Maintaining the cc: list, apologies for the cross posting. In fact the
official home for discussion on maintenance of org is the
public-egov-ig. I'll assume for now that public-gld-wg is sufficient.]

On Tue, 2011-11-08 at 17:21 +0100, Jakob Voss wrote: 
 On 08.11.2011 15:11, Phil Archer wrote:
 
  Thanks for raising this. Can I ask you what your use case is?
 
  The Government Linked Data Working Group [1] is chartered to look at
  this ontology and your use cases would be useful input to that process.
 
 In our union catalog we have  50 million copies of  30 million 
 publications in  80 libraries. I am about to create an RDF 
 representation of these libraries and their locations to express the 
 availability of documents in libraries. For this purpose I created the 
 DAIA ontology: http://purl.org/ontology/daia
 
 Here is an example in RDF/Turtle. The library my:library has closed 
 stacks (my:closedstacks), a main building (my:mainbuilding), and a 
 digital library access site (my:digitallibrary). For a particular 
 document (my:abstractdocument) it has three copies, one located at 
 each of the sites:
 
 @prefix schema: <http://schema.org/> .
 @prefix org: <http://www.w3.org/ns/org#> .
 @prefix dct: <http://purl.org/dc/terms/> .
 @prefix frbr: <http://purl.org/vocab/frbr/core#> .
 @prefix daia: <http://purl.org/ontology/daia/> .
 @prefix bibo: <http://purl.org/ontology/bibo/> .
 
 my:library a schema:Library, org:Organization ;
    org:hasSite
      my:closedstacks,
      my:mainbuilding,
      my:digitallibrary .
 
 my:abstractdocument a bibo:Document ;
    daia:exemplar
      my:book1, my:book2, my:ebook1 .
 
 my:book1 a frbr:Item ;
     dct:spatial my:closedstacks .
 
 my:book2 a frbr:Item ;
     dct:spatial my:mainbuilding .
 
 my:ebook1
     dct:spatial my:digitallibrary .
 
 The current solution is to use dct:spatial and an additional class that
 combines org:Site and dct:Location:
 
 daia:Storage a owl:Class ;
    rdfs:comment "A place where instances of frbr:Item are stored."@en ;
    rdfs:subClassOf dct:Location, org:Site .
 
 It would be more convenient to know that every org:Site is a dct:Location.
 
 Jakob
 





Re: How to express something is-located-at an org:Site

2011-11-08 Thread Dave Reynolds
Hi Jakob,

I understand your use case and personally would not be averse to adding
an aligned superproperty for org:hasSite.

The question is what one?

As you point out, org:Site is supposed to encompass non-physical sites.
This was to cater for organizations which use, for example, shared
virtual offices. Indeed I assume your my:digitallibrary is not a
physical location.

The trouble is that dct:Location is described as "a spatial region or
named place"; dbpedia:Place and http://schema.org/Place seem to be
definitely physical spatial locations. That seems to make them
unsuitable superclasses for org:Site, and presumably not suitable for
your digital library use case.

Can anyone with deeper understanding of DCT comment on whether this is
too narrow a view of dct:spatial/dct:Location - could a virtual office
or digital library be reasonably treated as a dct:Location?

Dave

P.S. Didn't cross post to public-gld-wg because I'm not allowed to :(

P.P.S. Apologies for any email noise from a previous failed send
attempt.


On Tue, 2011-11-08 at 14:49 +0100, Jakob Voss wrote: 
 Hi,
 
 The Organization Ontology as described at
 
 http://www.epimorphics.com/public/vocabulary/org.html
 
 contains org:Site for location information, both physical and 
 non-physical. There are properties to connect organizations and sites 
 (org:hasSite / org:siteOf) and to connect People and sites 
 (org:basedAt). But these properties have no general super-property to 
 express that something (not necessarily an org:Organization or 
 foaf:Person) is located at an org:Site.
 
 I found the following properties that may match:
 
 1. dcterms:spatial
 (http://dublincore.org/documents/dcmi-terms/#terms-spatial) has range
 dcterms:Location
 (http://dublincore.org/documents/dcmi-terms/#classes-Location) for "A
 spatial region or named place"
 
 2. http://dbpedia.org/ontology/location has range
 http://dbpedia.org/ontology/Place for "Immobile things or locations"
 
 3. http://schema.org/location has range http://schema.org/Place which is
 for "Entities that have a somewhat fixed, physical extension."
 
 Each choice would make org:Site a subclass of or equivalent to another 
 class for places. I'd prefer not to create yet another property but use 
 an existing one, so could the Organization Ontology be aligned to one of 
 the three ontologies listed above?
 
 Thanks
 Jakob
 





Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Dave Reynolds

Hi Leigh,

On 21/10/2011 08:04, Leigh Dodds wrote:

Hi Dave,

Thanks for the response, there's some good examples in there. I'm glad
that this thread is bearing fruit :)

I had a question about one aspect, please excuse the clipping:


Clipping is the secret to focused email discussions :)


On 20 October 2011 10:34, Dave Reynolds dave.e.reyno...@gmail.com wrote:

...
If you have two resources and later on it turns out you only needed one,
no big deal just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.


Ed referred to refactoring. So I'm curious about refactoring from a
single URI to two. Are developers necessarily stuffed, if they start
with one and later need two?

For example, what if I later changed the way I'm serving data to add a
Content-Location header (something that Ian has raised in the past,
and Michael has mentioned again recently) which points to the source
of the data being returned.

Within the returned data I can include statements about the document
at that URI referred to in the Content-Location header.

Doesn't that kind of refactoring help?


Helps yes, but I don't think it solves everything.

Suppose you have been using http://example.com/lovelypictureofm31 to 
denote M31. Some data consumers use your URI to link their data on M31 
to it. Some other consumers started linking to it in HTML as an IR 
(because they like the picture and the accompanying information, even 
though they don't care about the RDF). Now you have two groups of users 
treating the URI in different ways. This probably doesn't matter right 
now but if you decide later on you need to separate them then you can't 
introduce a new URI (whether via 303 or content-location header) without 
breaking one or other use. Not the end of the world but it's not a 
refactoring if the test cases break :)


Does that make sense?

Dave



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Dave Reynolds

On 21/10/2011 12:52, Leigh Dodds wrote:

Hi,

On 21 October 2011 08:47, Dave Reynolds dave.e.reyno...@gmail.com wrote:

...

On 20 October 2011 10:34, Dave Reynolds dave.e.reyno...@gmail.com wrote:


...
If you have two resources and later on it turns out you only needed one,
no big deal just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.


Ed referred to refactoring. So I'm curious about refactoring from a
single URI to two. Are developers necessarily stuffed, if they start
with one and later need two?

For example, what if I later changed the way I'm serving data to add a
Content-Location header (something that Ian has raised in the past,
and Michael has mentioned again recently) which points to the source
of the data being returned.

Within the returned data I can include statements about the document
at that URI referred to in the Content-Location header.

Doesn't that kind of refactoring help?


Helps yes, but I don't think it solves everything.

Suppose you have been using http://example.com/lovelypictureofm31 to denote
M31. Some data consumers use your URI to link their data on M31 to it. Some
other consumers started linking to it in HTML as an IR (because they like
the picture and the accompanying information, even though they don't care
about the RDF). Now you have two groups of users treating the URI in
different ways. This probably doesn't matter right now but if you decide
later on you need to separate them then you can't introduce a new URI
(whether via 303 or content-location header) without breaking one or other
use. Not the end of the world but it's not a refactoring if the test cases
break :)

Does that make sense?


No, I'm still not clear.

If I retain the original URI as the identifier for the galaxy and add
either a redirect or a Content-Location, then I don't see how I break
those linking their data to it as their statements are still made
about the original URI.

But I don't see how I'm breaking people linking to it as if it were an
IR. That group of people are using my resource ambiguously in the
first place. Their links will also still resolve to the same content.


Ah OK. So you introduce a new, different IR, but preserve the conneg so 
that old HTML pages' links to the picture still resolve. Yes you are 
right, I think that does work.


Dave



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-20 Thread Dave Reynolds
Hi Leigh,

On Wed, 2011-10-19 at 17:59 +0100, Leigh Dodds wrote:

 So, can we turn things on their head a little. Instead of starting out
 from a position that we *must* have two different resources, can we
 instead highlight to people the *benefits* of having different
 identifiers? That makes it more of a best practice discussion and one
 based on trade-offs: e.g. this class of software won't be able to
 process your data correctly, or you'll be limited in how you can
 publish additional data or metadata in the future.

Nice approach. Here's an attempt ...

Benefit 1: You can provide (meta)data separately about the IR and NIR

Sometimes the IR contains additional information (e.g. crafted BBC web
pages) or was produced by a non-trivial transformation from the NIR. In
those cases metadata such as license, copyright and provenance
information differ between the IR and NIR. Hence you need two
identifiers.
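
As a hedged Turtle sketch (URIs hypothetical): separate metadata for the
document and the thing it describes, linked by foaf:primaryTopic.

    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # The IR (document) carries its own licence ...
    <http://example.com/doc/m31> dct:license <http://example.com/licences/cc-by> ;
        foaf:primaryTopic <http://example.com/id/m31> .

    # ... while statements about the NIR (the galaxy) hang off a second URI.
    <http://example.com/id/m31> rdfs:label "M31 (Andromeda)" .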

Counter argument: this is problematic anyway. If your IR can conneg to
both an HTML and an RDF representation then by webarch they should be
equivalent. So a handcrafted web page with different license terms is
not a presentation of the NIR, it is just some interesting semi-related
web page :)


Benefit 2: Conceptual cleanliness and hedging your bets

In the field of human debate, as opposed to what machines do, we are now
clear that the map is not the territory but we weren't always so clear
and that led to confusion and erroneous arguments[1]. That learning may
be transferable. Even if we can't spot the practical problems right now
then differentiating between the galaxy itself and some piece of data
about the galaxy could turn out to be important in practice.

If you have two resources and later on it turns out you only needed one,
no big deal just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.


Cost 1: You have to decide if your resource is an IR or NIR and we can't
always

If you are going to have a distinction like IR/NIR you'd better be able
to explain it and work out which is which. We can't. It's OK for real
world objects which clearly can't go down the wire[2]. But anything
conceptual can be argued both ways - skos:Concepts, skos:ConceptSchemes,
qb:DataSets, rdf:Properties, eg:theColourRed. 
Person A: you can get your ontology / skos description / glossary entry
down the wire, that's all there is, so they are IRs. 
Person B: abstract concepts can't go down the wire so they are NIRs.
Deadlock.


Cost 2: Network cost - an uncachable round trip every time I look up a
data resource

Counter argument: just use #
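
As a minimal sketch of the hash alternative (hypothetical URIs): a
single GET on the document URI returns statements about both the
document and the fragment-identified thing, with no extra round trip.

    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # One retrievable document ...
    <http://example.com/galaxies/m31> dct:creator "A. Publisher" .
    # ... and a fragment identifier for the non-document thing it describes.
    <http://example.com/galaxies/m31#it> rdfs:label "M31 (Andromeda)" .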


Cost 3: Developer confusion/disbelief, inhibiting use

The clear cut cases like galaxies ([2] notwithstanding) are so silly
that no one thinks this confusion could ever arise. For the less clear
cases like skos:Concepts the discussion seems like dancing on the heads
of pins. Followed by "if this distinction is so important why is there
no way to tell that I have an NIR" - the http-range-14 solution only
says that it could be an NIR. 

The need to understand, implement and argue about this distinction
without the benefits actually being apparent *right now* *to me* is a
serious barrier to uptake.



Personally I find the costs more persuasive than the benefits but I've
tried to present the arguments neutrally.

Dave

[1] IANAP and can't even spell Korzybski without Google's help :)

[2] You can take this line further. Arguably eg:theMilkyWay is never
going to represent the galaxy itself, it is only ever a
conceptualization of it and that conceptualization *can* be encoded in
some language and sent down the wire. We are *never* really talking
about territories we are always talking about maps and postit notes
stuck on maps.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-20 Thread Dave Reynolds
Hi Norman,

On Thu, 2011-10-20 at 12:13 +0100, Norman Gray wrote:

 On 2011 Oct 20, at 10:34, Dave Reynolds wrote:

  Benefit 1: You can provide (meta)data separately about the IR and NIR
  [...]
  Counter argument: this is problematic anyway. If your IR can conneg to
  both an HTML and an RDF representation then by webarch they should be
  equivalent.
 
 Where is this written (I can't find support for this in a quick search 
 through http://www.w3.org/TR/webarch/)?

Good point. I've been in a number of discussions where equivalence (up
to some fuzzy notion of quality) has been assumed to be required.
However, you are right, I can't see any documentary evidence backing up
that assumption. Happy to relegate it to a red herring.

  Benefit 2: Conceptual cleanliness and hedging your bets
  
  [...]Even if we can't spot the practical problems right now
  then differentiating between the galaxy itself and some piece of data
  about the galaxy could turn out to be important in practice.
 
 It is.  I want to say that 'line 123 in this catalogue [an existing RDBMS] 
 and line 456 in that one both refer to the same galaxy, but they give 
 different values for its surface brightness'.  There's no way I can 
 articulate that unless I'm explicitly clear about the difference between a 
 billion suns and a database row.

Sure, differentiating *those* two is crucial but http-range-14 doesn't
itself solve that [*] any more than inserting a # character would.

Perhaps benefit 2 could be reframed as being about forcing you to
confront the map/territory distinction so you end up doing better
modelling - whether or not you implement 303s.

  Cost 1: You have to decide if your resource is an IR or NIR and we can't
  always
  
  If you are going to have a distinction like IR/NIR you'd better be able
  to explain it and work out which is which. We can't. It's OK for real
  world objects which clearly can't go down the wire[2]. But anything
  conceptual can be argued both ways - skos:Concepts, skos:ConceptSchemes,
  qb:DataSets, rdf:Properties, eg:theColourRed. 
 
  Person A: you can get your ontology / skos description / glossary entry
  down the wire, that's all there is, so they are IRs. 
 
 OK, I can see this point.
 
 I think your Person A is being either difficult or dense, 

Or accurately understanding that an ontology is a conceptualization, a
model, it is not reality :)

 but supposing it genuinely is that hard to draw a distinction in some case, 
 then it is probably correspondingly unlikely that there are importantly 
 different things to say about the putative IR and NIR, so the distinction may 
 not in fact matter.

Sure. The issue is around spending lots of time and energy in
discussions around that in cases where it doesn't change any outcomes.
Especially in domains where this is the sort of information that
dominates.

  Cost 3: Developer confusion/disbelief, inhibiting use
  
  The clear cut cases like galaxies ([2] notwithstanding) are so silly
  that no one thinks this confusion could ever arise. For the less clear
  cases like skos:Concepts the discussion seems like dancing on the heads
  of pins. Followed by "if this distinction is so important why is there
  no way to tell that I have an NIR" - the http-range-14 solution only
  says that it could be an NIR. 
  
  The need to understand, implement and argue about this distinction
  without the benefits actually being apparent *right now* *to me* is a
  serious barrier to uptake.
 
 I think the above argument works here, too.  If a provider can't see the 
 distinction, they're probably not going to say anything usefully distinct 
 about the two resources.
 
 Perhaps that should be the resolution: Dear Developer, there's a right way 
 to do this, and a less right way: the right way probably gives benefit to you 
 and is better for your data's consumers, but if you do it the other way, the 
 world won't end.  Love and kisses, public-lod.

The last part of that is right on - either can be made to work, don't
kill yourself over it.

I think the discussion Leigh was trying to start was can we more
clearly articulate those benefits of the 'right way'. I was taking a shot
at that, maybe a very limited off-target one.

 Is the argument actually about data _consumers_ getting confused about the 
 distinction?  Really?  I'd have thought that, once you've grokked RDF, you're 
 in a good place to understand the distinction fairly naturally, and in any 
 case you are by that stage looking at a screenful of RDF which is describing 
 a URI whose internal structure and 30xs you no longer have to care about.  I 
 bet I'm missing a use-case.

No, the issue is more for data publishers IMHO.

What's more I really don't think the issue is about not understanding
the distinction (at least in the clear cut cases). Most people I
talk to grok the distinction, the hard bit is understanding why 303
redirects are a sensible way of making it and caring about it enough to
put those

Re: Address Bar URI

2011-10-18 Thread Dave Reynolds
Hi Michael,

On Tue, 2011-10-18 at 10:57 +0100, Michael Smethurst wrote:

 All of the problems mentioned in this thread could be solved with the
 addition of a *generic* information resource URI that does the conneg
 separately from the 303. Target the *generic* information resource in
 your links and expose that in the address bar, keep the details of the
 specific representation URL tucked away in content location headers
 and just use the non-information resource as something to talk about.
 So you don't split the URIs you expose to the web and don't bounce
 every request through a 303 and don't need to use replaceState to
 replace the representation URL with something more sharable
 
 In the absence of a generic information resource URI you've only got
 two choices about what ends up in the address bar: the NIR URI or the
 specific representation URL. IMO it should be none of the above. The
 latter breaks sharing and the former doesn’t make sense

I agree with all you say about the separation of concerns between IR/NIR
and representation choice, that you should not confuse redirection (for
the former) and conneg (for the latter) and that you should have a
generic IR URI.

However, that does not solve all the presenting user problems.

The problem, as I see it, is that developers start from the NIR but then
use web browsers to find their way round the data and then cut-and-paste
the browser locations they find, thus ending up with IRs where they should
have had NIRs. At least that's what I took Hugh's proposal to be aimed
at.

To make it concrete ... the linked data from data.gov.uk follows the
same pattern as you recommend. For example the NIR for one particular
bathing water is:

http://environment.data.gov.uk/id/bathing-water/ukc2102-03600

This redirects to the generic IR URI:

http://environment.data.gov.uk/doc/bathing-water/ukc2102-03600

The representation is chosen via conneg (there are
representation-specific URIs such as .rdf and .json available but simply
following [NIR -> IR -> representation] does not expose them in the
address bar).

The problem is that a developer trying to use this data starts off with
an NIR from some data set. They want to find some connected resource
they can use in their app. They have been told that an advantage of
linked data is that the URIs are dereferenceable and return useful
information like onward links. So they put the NIR in their browser.
They click round to find the information they want, say, the Sampling
Point for that Bathing Water. They then cut that URI from the
browser bar:

http://location.data.gov.uk/doc/ef/SamplingPoint/bwsp.eaew/03600

and paste it in their app and/or publish more RDF referring to it. 

I've lost track of the number of times I've seen in published RDF links
to (generic) IR URIs instead of the NIR URIs, presumably as a result of
this pattern of use. I've even done it myself, at least in email
discussions, and I'm definitely supposed to know better!

This is especially painful for this sort of abstract data where the IR
URI isn't really of much use to anyone. I can't see many people writing
web pages linking to the IR. For them the html pages are just a more
readable rendering of the underlying data to help them understand what's
in there. So Hugh's suggestion would actually work quite nicely in such
cases, while being rather inappropriate for the BBC case.

Dave





Re: Address Bar URI

2011-10-18 Thread Dave Reynolds
On Tue, 2011-10-18 at 15:16 +0100, Michael Smethurst wrote: 
 
 
 On 18/10/2011 12:26, Dave Reynolds dave.e.reyno...@gmail.com wrote:
 
  Hi Michael,
  
  On Tue, 2011-10-18 at 10:57 +0100, Michael Smethurst wrote:
  
  All of the problems mentioned in this thread could be solved with the
  addition of a *generic* information resource URI that does the conneg
  separately from the 303. Target the *generic* information resource in
  your links and expose that in the address bar, keep the details of the
  specific representation URL tucked away in content location headers
  and just use the non-information resource as something to talk about.
  So you don't split the URIs you expose to the web and don't bounce
  every request through a 303 and don't need to use replaceState to
  replace the representation URL with something more sharable
  
  In the absence of a generic information resource URI you've only got
  two choices about what ends up in the address bar: the NIR URI or the
  specific representation URL. IMO it should be none of the above. The
  latter breaks sharing and the former doesn't make sense
  
  I agree with all you say about the separation of concerns between IR/NIR
  and representation choice, that you should not confuse redirection (for
  the former) and conneg (for the latter) and that you should have a
  generic IR URI.
  
  However, that does not solve all the presenting user problems.

[snip]

  This is especially painful for this sort of abstract data where the IR
  URI isn't really of much use to anyone. I can't see many people writing
  web pages linking to the IR. For them the html pages are just a more
  readable rendering of the underlying data to help them understand what's
  in there. So Hugh's suggestion would actually work quite nicely in such
  cases, while being rather inappropriate for the BBC case.
 
 Yes, I can see the problem. Some data sets are primarily aimed at the
 research community and some of those researchers are going to be primarily
 interested in NIRs

[I think we agree but I couldn't let that phrasing pass unchecked...]

You don't have to be a researcher to be interested in data :)

The developers using the bathing water data are doing things like
building consumer apps to enable people to check water quality at beaches
they are near. Not academic research.

There may be a distinction between data sets that are generally
consumed as pure data and data sets which mostly augment normal web
pages. But I don't think that's necessarily a research/real-world
distinction. 

 Not sure how you solve that except specifying up front the geek quotient
 level of the intended user community and publishing different ways for
 different folks

:)

 But if we do want linked data to be adopted more generally and not confined
 to the lab then we do need publishing guidelines that work for normal
 sites and normal users. I think that means following the patterns of
 data.gov.uk and the rest is developer education?!?

Probably.

[Personally I think we'd be better off without the IR/NIR distinction,
failing that stick to fragids for NIRs. But I see no value in going
round the block on those arguments yet again!]

Dave





Re: Minimum useful linked data

2011-09-03 Thread Dave Reynolds
[Left dist list in place but seems a little broad.]

Hi Danny,

On Sat, 2011-09-03 at 19:34 +0200, Danny Ayers wrote: 
 
 On the other hand the linked data API has this covered, e.g. in the
 deployment example [2]:
 
  /doc/school/12345 should respond with a document that includes
 information about /id/school/12345 (a concise bounded description of
 that resource)
 
 Except that to work with a CBD, reasonable knowledge is needed of RDF
 *and* there isn't really a friendly mapping from arbitrary graphs to
 JSON.

Actually a developer doesn't need to know anything about CBD for the
Linked Data API (LDA). 

The person who defines the Linked Data API spec for a given endpoint
does need to understand RDF but the consumer can work purely with
developer-friendly JSON or (non-namespace) XML.

In fact most API endpoints define views of subsets of properties [3],
though a describe view is always available. What's more Jeni's style
sheet for data.gov.uk provides nice HTML browsing of the views available
from the end point automatically generated from the LDA metadata, see
[4] for instance.

If you select the "all" view (which is a complete Describe) and select
JSON then you see [5]. Apart from there being a heck of a lot of
properties in that data I don't think the resulting JSON is too bad and
the default view [6] is really quite usable.

 But surely most of the immediately useful information (and ways to
 find further information) about the resource /doc/school/12345 will
 be contained in the triples matching:
 
 /doc/school/12345 ?pA ?o .
 ?s ?pB /doc/school/12345 .
 
 where ?o and ?s *are not bnodes*

Yes in the Linked Data API you can specify this and select the limited
sets of p's for each view. We did find we needed to be able to selectively
pull back deeper nested structure to make the JSON API usable. The most
common case being labels on resources but we also have cases like vcard
where you want to pull back substructure. See the address part of [6]
for example. That way the person who specifies the LDA for a given
dataset can create nice self-contained views of the data including nested
structures so that the consumer just has to traverse the JSON and doesn't
have to manually walk a load of links just to get one logical record.

Cheers,
Dave

[3] http://code.google.com/p/linked-data-api/wiki/API_Viewing_Resources
[4] http://education.data.gov.uk/doc/school/100866
[5] http://education.data.gov.uk/doc/school/100866.json?_view=all
[6] http://education.data.gov.uk/doc/school/100866.json






Multi-lingual labels for org ontology

2011-09-01 Thread Dave Reynolds
Thanks to Dominique Guardiola the org ontology [1][2] now has French
translations for the label/comment/title strings.

It's good to see multi-lingual support in semantic web ontologies and
I'm very grateful to Dominique for volunteering to do this translation.

Dave

[1] http://www.w3.org/ns/org#
[2] http://www.epimorphics.com/public/vocabulary/org.html




Re: Dataset URIs and metadata.

2011-07-22 Thread Dave Reynolds
On Fri, 2011-07-22 at 09:59 +0100, Michael Hausenblas wrote: 
 Frans,

[snip]

  Probably VoID metadata/dataset URIs will be easier to discover once  
  the /.well-known/void trick (described in paragraph 7.2 of the W3C  
  VoID document) is widely adopted.
 
 Agreed. But it's not a 'trick'. It's called a standard.

Is it?

There was me thinking it was an Interest Group Note.

Is there a newer version than:
http://www.w3.org/TR/2011/NOTE-void-20110303/

?

Dave





Re: Dataset URIs and metadata.

2011-07-22 Thread Dave Reynolds
On Fri, 2011-07-22 at 15:42 +0100, Michael Hausenblas wrote: 
 
  Probably VoID metadata/dataset URIs will be easier to discover once
  the /.well-known/void trick (described in paragraph 7.2 of the W3C
  VoID document) is widely adopted.
 
  Agreed. But it's not a 'trick'. It's called a standard.
 
  Is it?
 
 Yes, I think that RFC5785 [1] can be considered a standard. Unless you  
 want to suggest that RFCs are sorta not real standards :P

:)

I'm aware that /.well-known is standardized in RFC5785.

It was the claim that /.well-known/void is a standard that I was
surprised by. It's the sort of thing that could easily be on a Rec track
somewhere, I just wasn't aware of it.

FWIW I'm perfectly happy with VoID's current status as an Interest Group
note.

Cheers,
Dave

 On 22 Jul 2011, at 15:39, Dave Reynolds wrote:
 
  On Fri, 2011-07-22 at 09:59 +0100, Michael Hausenblas wrote:
  Frans,
 
  [snip]
 
  Probably VoID metadata/dataset URIs will be easier to discover once
  the /.well-known/void trick (described in paragraph 7.2 of the W3C
  VoID document) is widely adopted.
 
  Agreed. But it's not a 'trick'. It's called a standard.
 
  Is it?
 
  There was me thinking it was an Interest Group Note.
 
  Is there a newer version than:
  http://www.w3.org/TR/2011/NOTE-void-20110303/
 
  ?
 
  Dave
 
 
 





Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)

2011-06-22 Thread Dave Reynolds
On Wed, 2011-06-22 at 15:52 +0100, Leigh Dodds wrote: 
 Hi,
 
 On 22 June 2011 15:41, William Waites w...@styx.org wrote:
  What does WebID have to do with JSON? They're somehow representative
  of two competing trends.
 
  The RDF/JSON, JSON-LD, etc. work is supposed to be about making it
  easier to work with RDF for your average programmer, to remove the
  need for complex parsers, etc. and generally to lower the barriers.
 
  The WebID arrangement is about raising barriers. Not intended to be
  the same kind of barriers, certainly the intent isn't to make
  programmer's lives more difficult, rather to provide a good way to do
  distributed authentication without falling into the traps of PKI and
  such.
 
  While I like WebID, and I think it is very elegant, the fact is that I
  can use just about any HTTP client to retrieve a document whereas to
  get rdf processing clients, agents, whatever, to do it will require
  quite a lot of work [1]. This is one reason why, for example, 4store's
  arrangement of /sparql/ for read operations and /data/ and /update/
  for write operations is *so* much easier to work with than Virtuoso's
  OAuth and WebID arrangement - I can just restrict access using all of
  the normal tools like apache, nginx, squid, etc..
 
  So in the end we have some work being done to address the perception
  that RDF is difficult to work with and on the other hand a suggestion
  of widespread putting in place of authentication infrastructure which,
  whilst obviously filling a need, stands to make working with the data
  behind it more difficult.
 
  How do we balance these two tendencies?
 
 By recognising that often we just need to use existing technologies
 more effectively and more widely, rather than throw more technology at
 a problem, thereby creating an even greater education and adoption
 problem?

+1

Don't raise barriers to linked data use/publication by tying it to
widespread adoption and support for WebID.

Dave





Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Dave Reynolds
Hi Hugh,

 By the way, as is well-known I think, a lot of people use and therefore must 
 be happy with URIs that are not Range-14 compliant, such as 
 http://www.w3.org/2000/01/rdf-schema .

Your general point that there is non-compliant data out there that
people are still able to make use of is probably right, but that
specific example is compliant - those are all (even the ontology URI)
hash-URIs.

Dave





Re: Squaring the HTTP-range-14 circle

2011-06-17 Thread Dave Reynolds
On Thu, 2011-06-16 at 21:22 -0400, Tim Berners-Lee wrote:

 On 2011-06 -16, at 16:41, Ian Davis wrote:

  The problem here is that there are so few things that people want to
  say about web pages compared with the multitude of things they want to
  say about every other type of thing in existence.
 
 Well, that is a wonderful new thing.  For a long while it was difficult to
 put data on the web, while there is quite a lot of metadata.
 Wonderful idea that the semantic web may be beating the document
 web hands down but it's not totally clear that we should trash the
 use of URIs to refer to documents as we do in the document web.

I'm sure Ian wasn't claiming the data web is beating the document web
and equally sure that you don't really think he was :)

FWIW my experience is also that most of the data that people want to
publish *in RDF* is about things rather than web pages. Clearly there
*are* good use cases for capturing web page metadata in RDF but I've not
seen that many in-the-wild cases where people wanted to publish data
about *both* the web page and the thing.

That's why Ian's "Back to Basics" suggestion works for me [as a fall
back from "just use #"]. My interpretation is that, unlike most of this
thread, it wasn't saying "use URIs ambiguously" but saying the
interpretation of the URI is up to the publisher and is discovered from
the data not from the protocol response; it is legitimate to use an
http-no-# URI to denote a thing if that is what you really want to do.

Thus if I want to publish a table of e.g. population statistics at
http://foobar.gov.uk/datasets/population then I can do so and use that
URI within the RDF data as denoting the data set. As publisher I'm
saying this is a qb:DataSet not a web page, anything that looks like a
web page when you point a browser at it is just a rendering related to
that data and that rendering isn't being given a separate URI so you can
talk about it, sorry about that.
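
In hedged Turtle (the dataset URI is the one from the example above; the
title is hypothetical), the data served from the URI would itself carry
the publisher's declared interpretation:

    @prefix qb:  <http://purl.org/linked-data/cube#> .
    @prefix dct: <http://purl.org/dc/terms/> .

    # The publisher declares what the URI denotes in the data it serves.
    <http://foobar.gov.uk/datasets/population> a qb:DataSet ;
        dct:title "Population statistics" .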

 If you use HTTP 200 for something different, then 
 you break my ability to look at a page, review it, and then
 express my review in RDF,  using the page's URI as the identifier.

Not quite. It is saying that you can't give a review for my
http://foobar.gov.uk/datasets/population web page because the RDF
returned by the URI says it denotes a dataset not the web page. You can
still review the dataset itself. You can review other web pages which
don't return RDF data saying they are something other than a web page.

[As an aside, I would claim that most reviews are in fact about things -
restaurants, books, music - not about the web pages.]

Dave





Re: 15 Ways to Think About Data Quality (Just for a Start)

2011-04-12 Thread Dave Reynolds
On Fri, 2011-04-08 at 21:10 -0400, glenn mcdonald wrote:
 I don't think data quality is an amorphous, aesthetic, hopelessly
 subjective topic. Data beauty might be subjective, and the same data
 may have different applicability to different tasks, but there are a
 lot of obvious and straightforward ways of thinking about the quality
 of a dataset independent of the particular preferences of individual
 beholders. Here are just some of them:
 
 
 1. Accuracy: Are the individual nodes that refer to factual
 information factually and lexically correct. Like, is Chicago spelled
 Chigaco or does the dataset say its population is 2.7?
 
 
 2. Intelligibility: Are there human-readable labels on things, so you
 can tell what a thing is when you're looking at it? Is there a model, so
 you can tell what questions you can ask? If a thing has multiple
 labels (or a set of owl:sameAs things have multiple labels), do you
 know which (or if) one is canonical?
 
 
 3. Referential correspondence: If a set of data points represents some
 set of real-world referents, is there one and only one point per
 referent? If you have 9,780 data points representing cities, but 5 of
 them are "Chicago", "Chicago, IL", "Metro Chicago", "Metropolitain
 Chicago, Illinois" and "Chicagoland", that's bad.
 
 
 4. Completeness: Where you have data representing a clear finite set
 of referents, do you have them all? All the countries, all the states,
 all the NHL teams, etc? And if you have things related to these sets,
 are those projections complete? Populations of every country?
 Addresses of arenas of all the hockey teams?
 
 
 5. Boundedness: Where you have data representing a clear finite set of
 referents, is it unpolluted by other things? E.g., can you get a list
 of current real countries, not mixed with former states or fictional
 empires or administrative subdivisions?
 
 
 6. Typing: Do you really have properly typed nodes for things, or do
 you just have literals? The first president of the US was not "George
 Washington"^^xsd:string, it was a person whose name-renderings include
 "George Washington". Your ability to ask questions will be constrained
 or crippled if your data doesn't know the difference.
 
 
 7. Modeling correctness: Is the logical structure of the data properly
 represented? Graphs are relational databases without the crutch of
 rows; if you screw up the modeling, your queries will produce
 garbage.
 
 
 8. Modeling granularity: Did you capture enough of the data to
 actually make use of it. :us :president :george_washington isn't
 exactly wrong, but it's pretty limiting. Model presidencies, with
 their dates, and you've got much more powerful data.
 
 
 9. Connectedness: If you're bringing together datasets that used to be
 separate, are the join points represented properly? Is the US from
 your country list the same as (or owl:sameAs) the US from your list of
 presidencies and the US from your list of world cities and their
 populations?
 
 
 10. Isomorphism: If you're bringing together datasets that used to be
 separate, are their models reconciled? Does an album contain songs, or
 does it contain tracks which are publications of recordings of songs,
 or something else? If each data point answers this question
 differently, even simple-seeming queries may be intractable.
 
 
 11. Currency: Is the data up-to-date?
 
 
 12. Directionality: Can you navigate the logical binary relationships
 in either direction? Can you get from a country to its presidencies to
 their presidents, or do you have to know to only ask about presidents'
 presidencies' countries? Or worse, do you have to ask every question
 in permutations of directions because some data asserts things one way
 and some asserts it only the other?
 
 
 13. Attribution: If your data comes from multiple sources, or in
 multiple batches, can you tell which came from where?
 
 
 14. History: If your data has been edited, can you tell how and by
 whom?
 
 
 15. Internal consistency: Do the populations of your counties add up
 to the populations of your states? Do the substitutes going into your
 soccer matches balance the substitutes going out?

That's a fantastic list and should be recorded on a wiki somewhere!

A minor quibble, not sure about Directionality. You can follow an RDF
link in both directions (at least in SPARQL and any RDF API I've worked
with).  I would be inclined to generalize and rephrase this as ...

Consistency of modelling: whichever way you make modelling decisions,
such as the direction of relations (from country to president, or from
president to country), make them consistently so you don't have to ask
many permutations of the same query. 
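For instance, a minimal Turtle sketch (hypothetical vocabulary, echoing
the presidency example above):

    @prefix eg: <http://example.com/ns#> .

    # Pick one direction - presidency to country, presidency to president -
    # and use it throughout the dataset.
    eg:washingtonPresidency a eg:Presidency ;
        eg:country   eg:us ;
        eg:president eg:george_washington .
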
Possible additions:

Licensed: the license under which the data can be used is clearly
defined, ideally in a machine checkable way.

Sustainable: there is some credible basis for believing the data will
be maintained as current (e.g. backed by some appropriate organization
or by a sufficiently large group of individuals, has 

Re: Linked Data, Blank Nodes and Graph Names

2011-04-07 Thread Dave Reynolds
Hi Nathan,

On Thu, 2011-04-07 at 18:45 +0100, Nathan wrote: 
 Hi All,
 
 To cut a long story short, blank nodes are a bit of a PITA to work with, 
 they make data management more complex, newcomers don't get them 
 (unless presented as anonymous objects), and they make graph operations 
 much more complex than they need be, because although a graph is a set 
 of triples, you can't (easily) do basic set operations on non-ground 
 graphs, which ultimately filters down to making things such as graph 
 diff, signing, equality testing, checking if one graph is a super/sub 
 set of another very difficult. Safe to say then, on one side of things 
 Linked Data / RDF would be a whole lot simpler without those blank nodes.
 
 It's probably worth asking then, in a Linked Data + RDF environment:
 
 - would you be happy to give up blank nodes?

Happy, no.

From the point of view of data modelling and management I could live
without them, though I do find them helpful.

From the point of view of managing legacy, and having to maintain tool
chains that do that, then no. Maybe if they had never existed that
might have been better but the cost of putting the genie back in the
bottle is too great. Just imagine, for example, the cost of re-specifying
OWL so it could be encoded in an RDF without blank nodes - and that's just
one example.

 - just the [] syntax?

? What's the syntax got to do with it?

 - do you always have a name for your graphs? (for instance when 
 published on the web, the URL you GET, and when in a store, the ?G of 
 the quad?

Nope.

 I'm asking because there are multiple things that could be done:
 
 1) change nothing

+1

 2) remove blank nodes from RDF

-1

 3) create a subset of RDF which doesn't have blank nodes and only deals 
 with ground graphs

-1

That may be the worst of both worlds. Then you would have tools which
only deal with ground RDF and tools that support and use blank nodes.
The simpler tools wouldn't even be able to parse large tracts of
existing data including the normative encoding of OWL. I don't see such
fragmentation as healthy.

 4) create a subset of RDF which does have a way of differentiating blank 
 nodes from URI-References, where each blank node is named persistently 
 as something like ( graph-name , _:b1 ), which would allow the subset to 
 be effectively ground so that all the benefits of stable names and set 
 operations are maintained for data management, but where also it can be 
 converted (one way) to full RDF by removing those persistent names.

How does that solve anything?

Assuming the semantics is retained then to do any graph comparisons or
deltas you will still need to do the equivalent of graph isomorphism, it's
just that now you are matching nodes with an external arbitrary label
instead of ones which just have an internal arbitrary label. It doesn't
change *any* of the problems you list, and even complicates things by
having one more concept to explain to people.

The one thing this approach would facilitate is essentially round
tripping back to the same graph in the same store. If you get a query
result containing leaf nodes you would then be guaranteed to be able to
ask for more about those leaves. I can see benefit in that and it could
be made to coexist with the current RDF but it doesn't touch the other
problems.

Dave





Re: Design issues 5-star data section tidy up

2011-03-10 Thread Dave Reynolds
On Thu, 2011-03-10 at 15:15 +0100, Adrian Pohl wrote: 
 Hello Martin,

[snip]

  And yes, I agree with Christopher that the extreme notion of open is an 
  ideology, not a technology. Being able to automate the evaluation of what 
  you can do with the data is a technology. Requesting that all data must 
  belong to everybody with no strings attached is ideology.
 
 Nobody requests that all data must belong to everybody with no
 strings attached - this is only when you want to get five stars.

You need it for *any* stars:

1 star - Available on the web (whatever format), but with an open
licence [1]

The point is that the 5-star scheme requires open in the legal sense as
a prerequisite for getting on the scale at all, open in the technical
interoperability sense just helps you get more stars.

Makes perfect sense for government data releases, which is the context
in which the scheme was developed I believe. 

 As I
 understand it the open requirement is very much in line with the
 history of the web as it evolves around open standards and was
 established to share knowledge. One has to respect that. It's
 compatibility (technical as well as legal) that matters, not ideology.
 
 You could write a commercial definition to define licensing
 standards for commercial data publishers to reach compatibility in the
 world of commercial data providers and non-open licenses...

Of course, and you would have achieved interoperability and great things
but following that wouldn't count for a single star on Tim's 5-star
scheme. Which I think is Martin's issue.

Dave

[1] http://www.w3.org/DesignIssues/LinkedData.html





Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Dave Reynolds
On Wed, 2011-01-19 at 21:45 +, Nathan wrote: 
 David Wood wrote:
  On Jan 19, 2011, at 10:59, Nathan wrote:
  ps: as an illustration of how engrained URI normalization is, I've 
  capitalized the domain names in the to: and cc: fields, I do hope the mail 
  still comes through, and hope that you'll accept this email as being sent 
  to you. Hopefully we'll also find this mail in the archives shortly at 
  htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd 
  hope that any statements made using these URIs (asserted by man or 
  machine) would remain valid regardless of the (incorrect?-)casing.
  
  Heh.  OK, I'll bite.  Domain names in email addressing are defined in IETF 
  RFC 2822 (and its predecessor RFC 822), which defers the interpretation to 
  RFC 1035 (Domain names - implementation and specification).  RFC 1035 
  section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, 
  are to be compared in a case-insensitive manner.
  
  As far as I know, the W3C specs do not so refer to RFC 1035.
 
 And I'll bite in the other direction, why not treat URIs as URIs? 

It seems to me the underlying question here is whether aliasing of URIs
(whether they dereference to the same resource) should imply semantic
equality (i.e. use as an identifier in a web logic language like RDF or
OWL).

The position so far in RDF, OWL and RIF has been "no".

As far as the specifications for those languages are concerned a URI is
just a convenient spelling for an identifier and they require
comparison of identifiers to be stable and context-independent. 
Those specs don't constrain what you get back from dereferencing some
URI U to include statements about U.

The URI spec (rfc3986[1]) does allow this usage. In particular Section 6
Normalization and Comparison says:

   URI comparison is performed for some particular purpose.  Protocols
   or implementations that compare URIs for different purposes will
   often be subject to differing design trade-offs in regards to how
   much effort should be spent in reducing aliased identifiers.  This
   section describes various methods that may be used to compare URIs,
   the trade-offs between them, and the types of applications that might
   use them.

and

   We use the terms "different" and
   "equivalent" to describe the possible outcomes of such comparisons,
   but there are many application-dependent versions of equivalence.

While RDF predates this spec it seems to me that the RDF usage remains
consistent with it. The purpose of comparison in RDF is different from
that of cache retrieval of web pages or message delivery of email.

This quote also makes clear that there is no single definitive
normalization. There are different levels of normalization possible
depending on your needs. 

Earlier you pointed out that the place where the URI specs and RDF do
collide is in resolving relative URIs into absolute URIs. Again rfc3986
does not preclude the RDF usage. Section 5.2.1 says:

Normalization of the base URI, as described in Sections 6.2.2 and 
   6.2.3, is optional.

So I claim that in terms of formal published specifications:
(1) RDF, OWL and RIF do not require any normalization of URIs (beyond
the character encoding level) and compare URIs by simple string
comparison.
(2) This usage is *not* precluded by the URI specs, at least by 3986
which sets the current framework for the application of scheme-specific
specs.

** Now we turn to linked data ...

As we've already mentioned :) there are no specs for linked data so we
move onto more subjective grounds.

The linked data convention is that dereferencing some URI U in your RDF
document should return information about U, including further onward
links. So if data set A spells a URI hTTp://example.com/foo but the data
you get from dereferencing that URI talks only about
http://example.com/foo then someone has a problem somewhere. The
question is who, where and how to fix it.

It seems to me that this is primarily an issue with publishing, and a
little about being sensible about how you pass on links. If I'm going to
put up some linked data I should mint normalized URIs; I should use the
same spelling of the URIs throughout my data; I'll make sure those URIs
dereference and that the data that comes back is stable and useful. If
someone else refers to my resources using an aliased URI (such as a
different case for the protocol) and makes statements about those
aliases then they have simply made a mistake.

To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it. No surprise there.
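
A minimal sketch of the point about bases (hypothetical URIs):

    @base <http://example.com/data/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    # With the base stated in the document itself, <population> resolves to
    # <http://example.com/data/population> wherever the file is served from.
    <population> rdfs:label "Population statistics" .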

None of this requires us to force URI normalization into the heart of
identifier comparison in RDF itself. It is not a necessary solution and
it is not a sufficient one because there is no universal 

Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Dave Reynolds

Hi Nathan,

I largely agree but have a few quibbles :)

On 20/01/2011 2:29 PM, Nathan wrote:

Dave Reynolds wrote:

The URI spec (rfc3986[1]) does allow this usage. In particular Section 6
Normalization and Comparison says:

URI comparison is performed for some particular purpose. Protocols
or implementations that compare URIs for different purposes will
often be subject to differing design trade-offs in regards to how
much effort should be spent in reducing aliased identifiers. This
section describes various methods that may be used to compare URIs,
the trade-offs between them, and the types of applications that might
use them.

and

We use the terms "different" and
"equivalent" to describe the possible outcomes of such comparisons,
but there are many application-dependent versions of equivalence.

While RDF predates this spec it seems to me that the RDF usage remains
consistent with it. The purpose of comparison in RDF is different from
that of cache retrieval of web pages or message delivery of email.


Indeed, I also read though:

For all URIs, the hexadecimal digits within a percent-encoding
triplet (e.g., %3a versus %3A) are case-insensitive and therefore
should be normalized to use uppercase letters for the digits A-F.

When a URI uses components of the generic syntax, the component
syntax equivalence rules always apply; namely, that the scheme and
host are case-insensitive and therefore should be normalized to
lowercase...
- http://tools.ietf.org/html/rfc3986#section-6.2.2.1

And took the "For all" and "always" to literally mean for all and
always.


Those quotes come from section (6.2.2) describing normalization but the 
earlier quote is from the start of section 6 saying that choice of 
normalization is application dependent. I interpret the two together as 
*if* you are normalizing then always ...blah 


That was certainly the RIF position where we explicitly said that 
sections 6.2.2 and 6.2.3 of rfc3986 were not applicable.



against both the RDF Specification [1] and the URI specification when
they say /not/ to encode permitted US-ASCII characters (like ~ %7E)?


Where did that example come from?


The encoding consists of... %-escaping octets that do not correspond
to permitted US-ASCII characters.
- http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref

For consistency, percent-encoded octets in the ranges of ALPHA
(%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
underscore (%5F), or tilde (%7E) should not be created by URI
producers and, when found in a URI, should be decoded to their
corresponding unreserved characters by URI normalizers.
- http://tools.ietf.org/html/rfc3986#section-2.3

I read those quotes as saying do not encode permitted US-ASCII
characters in RDF URI References.


At what point have we suggested doing that?


As above


Sorry, I didn't mean to dispute that you shouldn't %-encode ~, I was 
wondering where the suggestion that you should do so came from.


I believe there are some corner cases, such as the handling of spaces, 
which differ between the RDF spec and the IRI spec. This was down to 
timing. The RDF Core WG was doing its best to anticipate what the IRI 
spec would look like but couldn't wait until that was finalized. 
Resolving any such small discrepancies between that anticipation and the 
actual IRI specs is something I believe to be in scope for the proposed 
new RDF WG.



So use normalized URIs in the first place.

...

RDF/OWL/RIF aren't designed the way they are because someone thought it
would be a good idea to allow such things to be used side by side or
because they *want* people to use denormalized URIs.

...

The point is that there is no single, simple, universal (i.e. across all
schemes) normalization algorithm that could be used.
The current approach gives stable, well-defined behaviour which doesn't
change as people invent new URI schemes. The RDF serializations give you
enough control to enable you to be certain about what URI you are
talking about. Job done.


Okay, I agree, and I'm really not looking to create a lot of work here,
the general gist of what I'm hoping for is along the lines of:

RDF Publishers MUST perform Case Normalization and Percent-Encoding
Normalization on all URIs prior to publishing. When using relative URIs
publishers SHOULD include a well defined base using a serialization
specific mechanism. Publishers are advised to perform additional
normalization steps as specified by URI (RFC 3986) where possible.

RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case
Normalization and Percent-Encoding Normalization.

Two RDF URIs are equal if and only if they compare as equal, character
by character, as Unicode strings.


I'm sort of OK with that but ...

Terms like "RDF Publisher" and "RDF Consumer" need to be defined in 
order to make formal statements like these. The RDF/OWL/RIF specs are 
careful to define what sort of processors are subject to conformance 
statements and I don't think RDF

Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-19 Thread Dave Reynolds

On 19/01/2011 3:55 AM, Alan Ruttenberg wrote:


The information on how to fully determine equivalence according to the
URI spec is distributed across a wide and growing number of different
specifications (because it is scheme dependent) and could, in
principle, change over time. Because of the distributed nature of the
information it is not feasible to fully implement these rules.
Optionally implementing these rules (each implementor choosing where
on the ladder they want to be) would mean that documents written in
RDF (and derivative languages) would be interpreted differently by
different implementations, which is an unacceptable feature of
languages designed for unambiguous communication. The fact that the
set of rules is growing and possibly changing would lead to a similar
situation - documents that meant one thing at one time could mean
different things later, which is also unacceptable, for the same
reason.


Well put, I meant to point out the implications of scheme-dependence and 
you've covered it very clearly.



David (Wood) clarifies (surprisingly to me as well) that the issue of
normalization could be addressed by the working group. I expect,
however, that any proposed change would quickly be determined to be
counter to the instructions given in the charter on Compatibility and
Deployment Expectation, and if not, would be rejected after justified
objections on this basis from reviewers outside the working group.


+1

Dave




Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-18 Thread Dave Reynolds
On Mon, 2011-01-17 at 18:16 +, Nathan wrote: 
 Dave Reynolds wrote:
  On Mon, 2011-01-17 at 16:52 +, Nathan wrote: 
  I'd suggest that it's a little more complex than that, and that this may 
  be an issue to clear up in the next RDF WG (it's on the charter I believe).
  
  I beg to differ.
  
  The charter does state: 
  
  Clarify the usage of IRI references for RDF resources, e.g., per SPARQL
  Query §1.2.4.
  
  However, I was under the impression that was simply removing the small
  difference between RDF URI References and the IRI spec (that they had
  anticipated). Specifically I thought the only substantive issue there
  was the treatment of space and many RDF processors already take the
conservative position on that anyway.
 
 Likewise, apologies as I should have picked my choice of words more 
 appropriately, I intended to say that the usage of IRI references was up 
 for clarification, and if normalization were deemed an issue then the 
 RDF WG may be the place to raise such an issue, and address if needed.

OK, that makes sense.

 As for RIF and GRDDL, can anybody point me to the reasons why 
 normalization are not performed, does this have xmlns heritage?

Not as far as I know. At least in RIF we were just trying to be
compatible with the RDF specs which (cwm notwithstanding) do not
specify normalization other than the IRI-compatible character encoding. 

Dave





Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Dave Reynolds
On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote: 
 Dear all:
 
 RFC 2616 [1, section 3.2.3] says that
 
 When comparing two URIs to decide if they match or not, a client   
 SHOULD use a case-sensitive octet-by-octet comparison of the entire
 URIs, with these exceptions:
 
- A port that is empty or not given is equivalent to the default
  port for that URI-reference;
- Comparisons of host names MUST be case-insensitive;
- Comparisons of scheme names MUST be case-insensitive;
- An empty abs_path is equivalent to an abs_path of /.
 
 Characters other than those in the reserved and unsafe sets (see
 RFC 2396 [42]) are equivalent to their % HEX HEX encoding.
 
 For example, the following three URIs are equivalent:
 
http://abc.com:80/~smith/home.html
http://ABC.com/%7Esmith/home.html
http://ABC.com:/%7esmith/home.html
 
 
 Does this also hold for identifying RDF resources
 
 a) in theory and

No. RDF Concepts defines equality of RDF URI References [1] as simply
character-by-character equality of the %-encoded UTF-8 Unicode strings.

Note the final Note in that section:


Note: Because of the risk of confusion between RDF URI references that
would be equivalent if dereferenced, the use of %-escaped characters in
RDF URI references is strongly discouraged. 


which explicitly calls out the difference between URI equivalence
(dereference to the same resource) and RDF URI Reference equality.
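
To illustrate with a hedged Turtle sketch (hypothetical data): the two
subjects below may well dereference to the same page, but they differ
character-by-character, so an RDF store keeps them as two distinct nodes
unless something like an owl:sameAs link is added.

    @prefix dct: <http://purl.org/dc/terms/> .

    # Two spellings, two RDF URI references - nothing merges these triples.
    <http://abc.example/~smith/home.html>   dct:creator "Smith" .
    <http://ABC.example/%7Esmith/home.html> dct:creator "Smith" .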

BTW the more up to date RFC for looking at equivalence (as opposed to
equality) issues is probably the IRI spec [2] which defines a comparison
ladder for testing equivalence.

Dave

[1]
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-URIref

[2] http://www.ietf.org/rfc/rfc3987.txt




Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Dave Reynolds
On Mon, 2011-01-17 at 16:52 +, Nathan wrote: 
 Dave Reynolds wrote:
  On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote: 
  Dear all:
 
  RFC 2616 [1, section 3.2.3] says that
 
  When comparing two URIs to decide if they match or not, a client   
  SHOULD use a case-sensitive octet-by-octet comparison of the entire
  URIs, with these exceptions:
 
 - A port that is empty or not given is equivalent to the default
   port for that URI-reference;
 - Comparisons of host names MUST be case-insensitive;
 - Comparisons of scheme names MUST be case-insensitive;
 - An empty abs_path is equivalent to an abs_path of /.
 
  Characters other than those in the reserved and unsafe sets (see
  RFC 2396 [42]) are equivalent to their % HEX HEX encoding.
 
  For example, the following three URIs are equivalent:
 
 http://abc.com:80/~smith/home.html
 http://ABC.com/%7Esmith/home.html
 http://ABC.com:/%7esmith/home.html
  
 
  Does this also hold for identifying RDF resources
 
  a) in theory and
  
  No. RDF Concepts defines equality of RDF URI References [1] as simply
  character-by-character equality of the %-encoded UTF-8 Unicode strings.
  
  Note the final Note in that section:
  
  
  "Note: Because of the risk of confusion between RDF URI references that
  would be equivalent if dereferenced, the use of %-escaped characters in
  RDF URI references is strongly discouraged."
  
  
  which explicitly calls out the difference between URI equivalence
  (dereference to the same resource) and RDF URI Reference equality.
 
 I'd suggest that it's a little more complex than that, and that this may 
 be an issue to clear up in the next RDF WG (it's on the charter I believe).

I beg to differ.

The charter does state: 

"Clarify the usage of IRI references for RDF resources, e.g., per SPARQL
Query §1.2.4."

However, I was under the impression that was simply removing the small
difference between RDF URI References and the IRI spec (that they had
anticipated). Specifically I thought the only substantive issue there
was the treatment of space and many RDF processors already take the
conservative position on that anyway.

Replacing encoded string equality by dereference-equivalence would be a
pretty big change to RDF and I hadn't realized that was being
considered.

Could one of the nominated chairs or a W3C rep clarify this?

 For example:
 
 When a URI uses components of the generic syntax, the component
 syntax equivalence rules always apply; namely, that the scheme and
 host are case-insensitive and therefore should be normalized to
 lowercase.  For example, the URI <HTTP://www.EXAMPLE.com/> is
 equivalent to <http://www.example.com/>.
 
 - http://tools.ietf.org/html/rfc3986#section-6.2.2.1

Sure but the later RDF-related specs such as GRDDL and RIF clarify the
application of that in RDF. For example in RIF [1] we said:

"Neither Syntax-Based Normalization nor Scheme-Based Normalization
(described in Sections 6.2.2 and 6.2.3 of RFC-3986) are performed."

A form of words that, I think, we lifted verbatim from GRDDL which in
turn had chosen them to clarify how the original RDF URI References spec
should be interpreted in the light of the updated URI/IRI RFCs.

Changing RDF to require syntax or scheme based normalization would
require changing at least RIF and GRDDL as well. If that was really on
the cards I would have expected it to have been more broadly publicized.
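
As a concrete illustration (hypothetical URIs): under the current specs
the following remain two distinct RDF nodes, even though Section 6.2.2.1
of RFC 3986 treats the URIs as equivalent:

   @prefix ex: <http://example.org/ns#> .

   <HTTP://www.EXAMPLE.com/> ex:note "upper-case form" .
   <http://www.example.com/> ex:note "lower-case form" .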

Dave

[1] http://www.w3.org/TR/2010/PR-rif-dtb-20100511/#Relative_IRIs





Re: Semantics of rdfs:seeAlso (Was: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?)

2011-01-13 Thread Dave Reynolds
On Thu, 2011-01-13 at 06:29 -0500, Tim Berners-Lee wrote:

 This is the Linked Open Data list.
 The Linked Data world is a well-defined bit of engineering.
 It has co-opted the rdfs:seeAlso semantics of "if you are looking up x, 
 load y" from the much 
 earlier FOAF work.  

Where is this "well-defined bit of engineering" defined in such a way
that makes that co-option clear? [*]

Assuming a particular use of rdfs:seeAlso as a convention for some
community (e.g. FOAF) that wants to adapt that particular pattern is
just fine.

Updating specs in the future to narrow the interpretation to support
this assumed usage might be OK, so long as due process is followed,
but that hasn't happened yet.

Complaining when others go by the existing spec does not seem
reasonable.

 The URI space is full of empty space waiting for you to define terms
 with whatever semantics you like for your own use.
 But one can't argue philosophically that for some reason 
 the URI rdfs:seeAlso should have some other meaning when people are using it 
 and 
 there have been specs.

Those specs support Martin's usage, as his quotes from them clearly
demonstrated.

 One *can* argue that the RDFS spec is definitive, and it is very loose in its 
 definition.

Loose in the sense of allowing a range of values but as a specification
it is unambiguous in this case, as Martin has already pointed out:

"When such representations may be retrieved, no constraints are placed
on the format of those representations."

 We could look at maybe asking for an erratum to the spec
 to make it clear and introduce the other term in the same spec.

Or mint a sub-property of rdfs:seeAlso which provides the additional
constraints.
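
A minimal sketch of such a sub-property (the ex: namespace and the
wording of the comment are hypothetical):

   @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix ex:   <http://example.org/ns#> .

   ex:seeAlsoData a rdf:Property ;
       rdfs:subPropertyOf rdfs:seeAlso ;
       rdfs:comment "Like rdfs:seeAlso, but the object is expected to resolve to an RDF representation." .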

Dave

[*] And yes, I'm well aware of [1] which does mention the foaf
convention but it does so just as one convention in passing, there's no
clear suggestion in there that tools should rely on that convention for
arbitrary linked data.

[1] http://www.w3.org/DesignIssues/LinkedData.html 





Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-13 Thread Dave Reynolds
On Thu, 2011-01-13 at 11:43 +, Nathan wrote:

 linked data is not some term for data with links, it's an engineered 
 protocol which has constraints and requirements to make the whole thing 
 work.

Where is the spec for this engineered protocol and where in that spec
does it redefine rdfs:seeAlso?

[I believe I have reasonably decent understanding of, and experience
with, linked data. It is a useful set of conventions and practices
building on some underlying formal specifications. However, I'm not
aware of those practices being so universally agreed and formally
codified as to justify some of the claims being made in this thread.]

Dave





Re: Is vCard range restriction on org:siteAddress necessary?

2011-01-04 Thread Dave Reynolds
Hi Phil,

On Tue, 2011-01-04 at 10:32 +, Phil Archer wrote: 
 I'm doing a bit of grunt work on some data about companies and want to 
 remodel relevant sections using the org vocabulary [1]. But... I'd 
 rather not be forced to use vCard for the address info (because UK 
 addresses don't fit the vCard model particularly well. You can make them 
 fit, but it's a metric peg in an imperial-sized hole).

Is VCard that bad? It fits your example below just fine.

Part of the design goals were to reuse existing vocabularies where
possible. Since VCard is pretty widely used for contact details it
seemed like the obvious choice and preferable to getting bogged down in
defining another addressing vocabulary without a strong reason.

 Further, does address info need to be in a separate class from the site 
 info? In other words, what's the argument for /not/ doing:
 
 [] a org:Site ;
    ex:addressLine1 "Unit 5" ;
    ex:addressLine2 "Exemplary Industrial Estate" ;
    ex:town "Anytown" ;
    ex:county "Anycounty" ;
    ex:country "United Kingdom" ;
    os:postcode <...postcodeunit/EX11EX> ;
    geo:lat 51.2569489 ;
    geo:long -2.2007225 .
 
 
 [1] http://www.epimorphics.com/public/vocabulary/org.html

The separation between the Site and the address isn't necessary in
general, but it is necessary in order to reuse vcard. An org:Site isn't
a vcard:Address [*] hence the need for the indirection. 

I'm not sure it would make sense to drop the range restriction on
org:siteAddress, given that it is there specifically to support the
vcard style.

So are there alternative addressing vocabs we should be supporting
instead of, or as well as, vcard?

Or is there a flaw in vcard sufficient to justify creating an org
extension to use for addressing instead of vcard?

Cheers,
Dave

[*] I think of it as a vcard:Address representing an address label, a
structured version of the vcard:Label formatted label, rather than a
geographic entity. For example, in vcard the geo coordinates are
associated with the VCard not the Address.




Re: Is vCard range restriction on org:siteAddress necessary?

2011-01-04 Thread Dave Reynolds
On Tue, 2011-01-04 at 13:28 +0100, William Waites wrote: 
 * [2011-01-04 11:49:43 +] Dave Reynolds dave.e.reyno...@gmail.com écrit:
 
 ] Is VCard that bad? It fits your example below just fine.
 
 The only problem I see with the example is that we don't have counties
 in Scotland, we have districts. In Quebec and Louisiana and other
 historically catholic places we have parishes. Is Scotland a "state"
 in the American sense? Not really. You could use things like vc:county
 and vc:state and just say that the naming is bad, I guess.

Agreed, that's one reason not to make up another set of address terms
such as Phil's ex: examples.

The vcard terms (locality, region) strike me as reasonably neutral
whereas ex:county is not.

Dave





Re: Is vCard range restriction on org:siteAddress necessary?

2011-01-04 Thread Dave Reynolds
Hi Phil,

My inclination is to simply use VCard as is (including the
sub-resources) rather than try to short cut by collapsing the VCard and
Address. So I'd tend to write your example as:

<blah> a org:Site;
   org:siteAddress <blah/vcard> .

<blah/vcard> a v:VCard ;
   v:fn "Blah Ltd (Headquarters)";

   geo:lat 51.2569489 ;
   geo:long -2.2007225 ;

   v:geo [
       v:latitude  51.2569489 ;
       v:longitude -2.2007225 ;
   ];

   v:adr [
       v:extended-address "Unit 5" ;
       v:street-address "Example Industrial Estate" ;
       v:locality "Westbury" ;
       v:region "Wiltshire" ;
       v:postal-code "EX1 1EX" ;
       v:country-name "United Kingdom" ;
       os:postcode <.../postcodeunit/EX11EX> ;

       v:label
           """Unit 5,
Example Industrial Estate,
Westbury,
Wiltshire,
EX1 1EX,
United Kingdom"""
   ];
   .

 Is this kind of modelling useful, over and above including address 
 info directly as properties of the site?

The advantages are:

o easy to consume by people who already know about (and have code for
handling) vcard - no need to cater for some new one-off address
vocabulary

o an org:Site could have multiple address vcards (an outlier
requirement, to be fair - see the sketch below)

Disadvantages:

o slightly more complex to query (extra levels of indirection), though
easier to work with if you have CBD (Concise Bounded Description) support

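A minimal sketch of the multiple-vcards case mentioned above (URIs
hypothetical): one site carrying both a normal and an out-of-hours
contact card:

   <site> a org:Site ;
       org:siteAddress <site/contact> , <site/contact-ooh> .

   <site/contact>     a v:VCard ; v:fn "Main office contact" .
   <site/contact-ooh> a v:VCard ; v:fn "Out-of-hours contact" .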

For me there was no other good option. Reuse trumps convenience. If I
had included address properties in Org I'm sure I would have had a lot
of "don't reinvent wheels, just use VCard" comments :)

Dave

On Tue, 2011-01-04 at 15:30 +, Phil Archer wrote: 
 Thanks everyone for the replies. OK, I'm going to try and use vCard like 
 it says!
 
 Dave R - thanks for the example of one location having 2 addresses. I 
 thought that such a thing might be possible but couldn't think of an 
 example. I was thinking about shared office space but that didn't lead 
 to separate address labels for a single site within an org chart.
 
 Keith - yes, the RDF encoding of vCard is very good... at encoding vCard.
 
 Here's a repeat of my (very close to real world, Dave C) example:
 
 <blah> a org:Site ;
    ex:addressLine1 "Unit 5" ;
    ex:addressLine2 "Example Industrial Estate" ;
    ex:town "Westbury" ;
    ex:county "Wiltshire" ;
    ex:country "United Kingdom" ;
    os:postcode <.../postcodeunit/EX11EX> ;
    geo:lat 51.2569489 ;
    geo:long -2.2007225 .
 
 If my understanding of vCard is correct then what follows is a valid 
 conversion (I don't claim that this is the only valid interpretation):
 
 <blah> a org:Site;
    org:siteAddress <blah/vcard> .
 
 <blah/vcard> a v:Address ;
    v:extended-address "Unit 5" ;
    v:street-address "Example Industrial Estate" ;
    v:locality "Westbury" ;
    v:region "Wiltshire" ;
    v:postal-code "EX1 1EX" ;
    v:country-name "United Kingdom" ;
    os:postcode <.../postcodeunit/EX11EX> ;
    geo:lat 51.2569489 ;
    v:latitude  51.2569489 ;
    v:longitude -2.2007225 ;
    geo:long -2.2007225 ;
    v:label
      """Unit 5,
      Example Industrial Estate,
      Westbury,
      Wiltshire,
      EX1 1EX,
      United Kingdom""" .
 
 (btw I've gone back to the RFC for vCard [1] to find out that the order 
 of elements for an address is:
 
 post office box;
 extended address;
 street address;
 locality (e.g., city);
 region (e.g., state or province);
 postal code;
 country name.)
 
 I've used the top level class of 'Address' since it doesn't feel right 
 to call this a work address. Is it OK to go straight from an org:site to 
 v:Address without going through v:VCard first?
 
 Dave suggested using Label, which seems sensible, but that gets 
 confusing, I think anyway, when considering the label property, the 
 value for which is a pre-formatted string that can be printed and stuck 
 on an envelope (the triple quotes come courtesy of the Turtle spec at [2]).
 
 I've included the OS postcode data (which is one of the drivers for this 
 work in the first place) as well as the geo Lat/long.
 
 I guess the underlying question here is: is this what you have in mind, 
 Dave? Is this kind of modelling useful, over and above including address 
 info directly as properties of the site?
 
 Phil.
 
 [1] http://www.ietf.org/rfc/rfc2426.txt
 [2] http://www.w3.org/TeamSubmission/turtle/#longString
 
 On 04/01/2011 13:39, Dave Reynolds wrote:
  On Tue, 2011-01-04 at 12:38 +, Alexander Dutton wrote:
  On 04/01/11 11:49, Dave Reynolds wrote:
  The separation between the Site and the address isn't necessary in
  general, but it is necessary in order to reuse vcard. An org:Site isn't
  a vcard:Address [*] hence the need for the indirection.
 
  I think there's some confusion between the vCard and the address.
 
  You are right, my response wasn't clear - my head is not properly
  rebooted after the holidays :)
 
  At one point we had considered having org:siteAddress point directly to
  a vcard:Address, hence my confusing response.  However, in the end we
  wanted to allow all the vcard properties (not just addresses) without
  conflating a site with its

Re: Is vCard range restriction on org:siteAddress necessary?

2011-01-04 Thread Dave Reynolds
On Tue, 2011-01-04 at 11:02 -0500, Tim Berners-Lee wrote: 
 I wish the conflation of a VCard and a SocialEntity whose card it is
 were
 either ruled out completely or asserted completely by statements in
 the ontology.

+1

 I personally find that the class of business card is one which I do
 not
 want to have any data about.  (In fact for me it maps best 
 not to a node in the graph but to the RDF document whose contents is
 the graph.
 Important for provenance in that respect, but not part of this
 ontology).

The use case that I've come across has been bundling sets of contact
mechanisms together. 

For example, in local government there can be a main contact point for a
Council (with email, phone, mail address, web form) and then contact
points for out of hours use or for specific services (each potentially
with multiple contact mechanisms). 

This grouping-for-a-purpose is different from grouping by organization
or grouping by site (e.g. the council main offices may have both a
normal and an out-of-hours set of contact details). 

Dave

 
 My personal take on this in 1990 was the contact: ontology, which had
 the classes
 
 
 SocialEntity (subclasses: Person, Organization)
 and
 Location
 
 
 and properties 
 
 
 home, work, vacation
 
 
 link a Person (say) to a Location.  Locations 
 
 
 Similarly I could imagine properties like
 
 
 site, headquarters, deliveriesPlease, corporateSeat
 
 
 would link an Organization to a Location.
 
 
 (I was extra careful in making street, city, postcode, country
 properties of the address of a location not of the location itself,
 allowing a location to have >1 address, or two organizations to have
 notional locations which were different and had different phone
 numbers but the same address.
 I used it for mapping my contact stuff out of Outlook into RDF.  I
 needed assistant as Outlook has an Assistant phone number.)
 
 
 In all this a card has no useful place I can see.  Nor is there a
 1-1 correspondence between it and anything except for possibly
 SocialEntity.  So I would be in favour of the practice of translating
 VCards into information about a Social Entity (or an Organization or a
 Person), and not a card.
 
 
 Tim
 
 
 
 
 
 
 
 
 
 
 On 2011-01 -04, at 09:03, Dave Reynolds wrote:
 
  On Tue, 2011-01-04 at 13:28 +0100, William Waites wrote: 
   * [2011-01-04 11:49:43 +] Dave Reynolds
   dave.e.reyno...@gmail.com écrit:
   
   ] Is VCard that bad? It fits your example below just fine.
   
   The only problem I see with the example is that we don't have
   counties
   in Scotland, we have districts. In Quebec and Louisiana and other
   historically catholic places we have parishes. Is Scotland a
   "state"
   in the American sense? Not really. You could use things like
   vc:county
   and vc:state and just say that the naming is bad, I guess.
  
  Agreed, that's one reason not to make up another set of address
  terms
  such as Phil's ex: examples.
  
  The vcard terms (locality, region) strike me as reasonably neutral
  whereas ex:county is not.
  
 
 
 Yes.  In fact, a convention for mapping between them
 would be useful, even if it is in the comments in the ontology
 so that if you click through from locality it says "such as a city (US)
 or parish (Scotland)".
 Guidance for ontology users in the ontology file is useful.
 
 
 (Presumably e.g. OSX's Address Book has defined this mapping as they
 will format all your addresses (whatever country they are in) in your
 chosen
 local style of any of many countries.)
 
 
 Tim
 
 
 
 
 
 
 
 Address
     type: Class

 contact point
     type: Class
     comment: A place, or mobile situation, with address, phone number,
         fax, etc. Related to a person by home, office, etc. Note one
         person's workplace may be another person's home. A person may
         have more than one home and more than one workplace. (In
         practice it sometimes may be useful with restricted datasets to
         assume that this is not the case, when extracting data from
         other ontologies with no concept of ContactLocation.) Strongly
         related to a person: in some ways a role that a person can be in.
     label: contact point

 fax
     label: fax
     subClassOf: phone

 Female
     type: Class

 Language Code
     type: Class

 Male
     type: Class

 mobile
     label: mobile
     subClassOf: phone

 Pager
     subClassOf: phone

 Person
     comment: A person in the normal sense of the word.
     subClassOf: Social Entity

 phone
     type: Class
     comment: An end-point in the public switched telephone system.
         Anything identified by a URI with the tel: scheme is in this
         class.
     label: phone, tel.

 Social Entity
     type: Class
     comment: The sort of thing which can have a phone number. Typically
         a person or an incorporated company, or unincorporated group.

 subject to change
     label: subject to change

 address Property
     type: Property

 address
     type: Property
     domain: contact point
     label: address
     range: Address

 assistant
     type: Property
     comment: A person (or other agent) who

Re: Failed to port datastore to RDF, will go Mongo

2010-11-25 Thread Dave Reynolds
Hi Friedrich,

On Thu, 2010-11-25 at 00:43 +0100, Friedrich Lindenberg wrote:

 Anyway, I'd like to raise some additional points for the future: 
 
 1. I'd like to get a better picture of who is currently developing end-user 
 open government data applications based on linked data. Given that there is a 
 massive push towards releasing OGD as LD, I'd be eager to find out who is 
 consuming it in which kind of (user-facing) context, especially regarding 
 government transparency. More precisely: is RDF used primarily as an 
 interchange format or are there many people actively running sites using it? 

We have been doing a little of this. In particular, we developed a
simple data explorer for the LOD local government spend data which we're
working out how best to make public.

This uses the Linked Data API [1] to expose the data from a triple store
and client-side javascript for the UI, though could equally well have
been done server side.

However, we are not a fair test case! [2]

[Aside: Having been actively involved in the LOD side of UK open government
data, the term "massive push" isn't one that I would use, at least
not in that context! There has been genuine interest, some very hard
work by a few motivated people and some promising results but not that
much in the way of, say, resourcing. There *has* been some effective
publicity thanks to TimBL and Nigel Shadbolt but that has emphasized the
opening of data more than it has particular data representations, which
I'd regard as a good thing.]

Dave

[1] http://code.google.com/p/linked-data-api/

[2] We co-developed the ontology for publishing the data, co-developed
the spec (and an implementation) for the Linked Data API and are active
developers of the open source Jena RDF toolkit on which the backend of
this small app is based.





Re: data.gov.uk ontologies

2010-11-16 Thread Dave Reynolds
On Mon, 2010-11-15 at 19:50 -0200, Percy Enrique Rivera Salas wrote:
 Hello everyone
 
 I only found the educational ontology of data.gov.uk (school.rdf)
 
 Does anybody know where I could get the other ontologies of this dataset?

Which dataset, the education dataset?

If so, then http://education.data.gov.uk/def/school/ is really the
only relevant one.

Which other ones did you want?

There are only two non-standard ones mentioned in that dataset other than
school:

meta        http://education.data.gov.uk/def/meta/
- Used internally to annotate school.rdf with some additional label
information.

foundation  http://statistics.data.gov.uk/def/Foundation/
- Largely redundant and replaced by later ontologies such as org:

The schools data should be usable without either of these.

All of the concepts of those two ontologies should resolve individually,
e.g. [1] but an unfortunate effect of the data.gov.uk practices at the
time, and the way the redirection to the Talis hosting works, is that
you only get the description of a single concept back, not the enclosing
ontology.

If you really want these, despite all the caveats, then I can post them
somewhere.

Dave

P.S. In the fullness of time we'd like to see the education data updated
and shifted to fit in with newer data.gov.uk ontologies.

[1] http://education.data.gov.uk/def/meta/columnName





Re: What is a URL? And What is a URI

2010-11-12 Thread Dave Reynolds
On Thu, 2010-11-11 at 12:52 -0500, Kingsley Idehen wrote:
 All,
 
 As the conversation about HTTP responses evolves, I am inclined to
 believe that most still believe that:
 
 1. URL is equivalent to a URI
 2. URI is a fancier term for URL
 3. URI is equivalent to URL.
 
 I think my opinion on this matter is clear, but I am very interested
 in the views of anyone who doesn't agree with the following:
 
 1. URI is an abstraction for Identifiers that work at InterWeb scale
 2. A URI can serve as a Name
 3. A URI can serve as an Address
 4. A Name != Address
 5. We locate Data at Addresses
 6. Names can be used to provide indirection to Addresses i.e., Names
 can Resolve to Data.

Why would this be a matter of opinion? :) 

After all RFC3986 et al are Standards Track and have quite clear
statements on what Identifier connotes in the context of URI.
Such as:


Identifier 

  An identifier embodies the information required to distinguish
  what is being identified from all other things within its scope of
  identification.  Our use of the terms "identify" and "identifying"
  refer to this purpose of distinguishing one resource from all
  other resources, regardless of how that purpose is accomplished
  (e.g., by name, address, or context).  These terms should not be
  mistaken as an assumption that an identifier defines or embodies
  the identity of what is referenced, though that may be the case
  for some identifiers.  Nor should it be assumed that a system
  using URIs will access the resource identified: in many cases,
  URIs are used to denote resources without any intention that they
  be accessed.


Dave




Re: Is 303 really necessary?

2010-11-09 Thread Dave Reynolds
On Mon, 2010-11-08 at 22:17 +0100, Lars Heuer wrote: 
 Hi Ian,
 
 Even if I come from a slightly different camp (Topic Maps), I wonder
 if your proposal hasn't become reality already. Try to resolve
 rdf:type or rdfs:label: I think we agree that these resources describe
 abstract concepts and should not return 200 but 303. Both return 200.

Those are hash URIs; for example, rdf:type expands to the URI:

http://www.w3.org/1999/02/22-rdf-syntax-ns#type

As it says [1] in the RDF specs:

"the RDF treatment of a fragment identifier allows it to indicate a
thing that is entirely external to the document, or even to the shared
information space known as the Web. That is, it can be a more general
idea, like some particular car or a mythical Unicorn"

So those are perfectly fine. Ian's proposal and the discussion here have
been entirely about URIs without fragment identifiers, so-called "slash"
URIs.

Dave

[1] http://www.w3.org/TR/rdf-concepts/#section-fragID - though that is
an Informative rather than Normative section of the concepts document.





Re: 200 OK with Content-Location might work

2010-11-08 Thread Dave Reynolds
Hi John,

On Sun, 2010-11-07 at 15:12 +, John Sheridan wrote:

 However, three points from my perspective:
 
 1) debating fundamental issues like this is very destabilising for those
 of us looking to expand the LOD community and introduce new people and
 organisations to Linked Data. To outsiders, it makes LOD seem like it's
 not ready for adoption and use - which is deadly. This is at best the
 11th hour for making such a change in approach (perhaps even 5 minutes
 to midnight?).

+1

 2) the 303 pattern isn't *that* hard to understand for newbies and maybe
 even helps them grasp LOD. Making the difference between NIRs and IRs so
 apparent, I have found to be (counter-intuitively) a big selling point
 for LOD, when introducing new people to the paradigm. Let's not be too
 harsh on 303 - it does make an important distinction very clear for new
 adopters and, in my experience, it seems to be an approach new people
 grok quite quickly and easily.

-0.5

People grok the point of normal 30x redirection: you can have nice
stable URIs but implement them from unstable organizations.

Tying them up with NIR v. IR, especially since no one has a clear cut
way of distinguishing IR/NIR, is a problem at least in my experience. 
In our work on local authorities self-publishing URIs/descriptions this
issue was a significant barrier.

[Though I'm not certain the current outcome (use content-location) helps
that much for that case.]

Dave





Re: Is 303 really necessary?

2010-11-08 Thread Dave Reynolds
On Mon, 2010-11-08 at 15:34 -0500, David Booth wrote:
 On Mon, 2010-11-08 at 10:11 +, Toby Inkster wrote:
  On Thu, 4 Nov 2010 13:22:09 +
  Ian Davis m...@iandavis.com wrote:
  
   http://iand.posterous.com/is-303-really-necessary
  
  Ian brings up numerous difficulties with 303 responses.
  
  The two biggest issues in my opinion are:
  
  1. 303s can be tricky to configure unless you know your
 way around the server environment you're using, and
 have sufficient permissions on the server; and
  
  2. They require an additional HTTP request to get to the
 data the client actually wants.
  
  I think that without using RDF-specific publishing platforms (think
  WordPress for Linked Data) #1 is always going to be a difficulty.
 
 Why not use a 303-redirect service such as
 http://thing-described-by.org/ or http://t-d-b.org/ ?  That makes it
 trivially easy to do 303 redirects.  

Because the domain of the URI conveys some provenance and trust
connotations, particularly when dealing with public sector bodies. That
gets lost when redirected through a third party service. For many of the
groups I deal with that would not be acceptable.

In cases where third party redirection is acceptable then there are also
PURLs which arguably provide stronger expectations of longevity.

Dave





Re: isDefinedBy and isDescribedBy, Tale of two missing predicates

2010-11-05 Thread Dave Reynolds
On Thu, 2010-11-04 at 20:58 -0400, Kingsley Idehen wrote:

 When you create hypermedia based structured data for deployment on an
 HTTP network (intranet, extranet, World Wide Web) do include a
 relation that associates each Subject/Entity (or Data Item) with its
 container/host document. A suitable predicate for this is:
 wdrs:describedBy [2] .

Ian mentioned this predicate in his post.

Looking at [1] the range of wdrs:describeBy is given as class of POWDER
documents and is a sub class of owl:Ontology which seems to make it
unsuitable as a general predicate for the purpose being discussed here.

Dave

[1] http://www.w3.org/TR/powder-dr/#semlink





Re: [Request for Input] Linked Data Specifications

2010-11-05 Thread Dave Reynolds
Hi Michael,

A good idea.

Could I request that you more clearly separate the formal specifications
from the de facto community practice documents? The Change Set vocabulary, to
pick one example, doesn't really have the same standing, adoption or
level of scrutiny as the RFCs, does it?

Dave

On Fri, 2010-11-05 at 10:33 +, Michael Hausenblas wrote: 
 All,
 
 There are quite some specs beyond the core specs (HTTP, URIs, RDF) that are
 relevant to Linked Data. In order to document this, we've set up a Web page
 [1] collecting these specs. The page is primarily targeting Linked Data
 newbies but should, IMHO, also be able to offer some gems for advanced
 Linked Data folks.
 
 I'd appreciate suggestions via the ESW Wiki page [2] and hope that this is
 useful for the community.
 
 Cheers,
   Michael
 
 [1] http://linkeddata-specs.info/
 [2] 
 http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Specif
 ications
 






Re: isDefinedBy and isDescribedBy, Tale of two missing predicates

2010-11-05 Thread Dave Reynolds
On Fri, 2010-11-05 at 07:19 -0400, Kingsley Idehen wrote:
 On 11/5/10 4:51 AM, Dave Reynolds wrote: 
  On Thu, 2010-11-04 at 20:58 -0400, Kingsley Idehen wrote:
  
   When you create hypermedia based structured data for deployment on an
   HTTP network (intranet, extranet, World Wide Web) do include a
   relation that associates each Subject/Entity (or Data Item) with its
   container/host document. A suitable predicate for this is:
   wdrs:describedBy [2] .
  Ian mentioned this predicate in his post.
  
  Looking at [1] the range of wdrs:describeBy is given as class of POWDER
  documents and is a sub class of owl:Ontology which seems to make it
  unsuitable as a general predicate for the purpose being discussed here.
  
  Dave
  
  [1] http://www.w3.org/TR/powder-dr/#semlink
  
  
  
  
 Dave,
 
 I am not saying or implying that Ian didn't say this in his post.
 These issues have been raised many times in the past by others
 (including myself), repeatedly. 

Indeed. 

I was only responding on the specific suggestion to use wdrs, not
intending any broader comment.

 Here's the key difference though, yesterday was the first time that
 these suggestions were presented as somehow being mutually exclusive
 relative to use of 303 redirection.
 
 I don't want to start another session with Ian, but here is my
 fundamental issue: 
 Fixing RDF resources doesn't have to be at the expense of 303
 redirection (a mechanism for resolution). At the end of the day there are
 going to be resolvable object/entity identifiers either side of these
 predicates, if we are seeking to keep the resulting Linked Data mesh
 intact etc..
 
 dropping 303 simply didn't need to be the focal point of the
 conversation. It has nothing to do with why people have been
 publishing old school RDF resources that fail to link the container
 (rdf doc) with its structured content (triples).
 
 I hope I've made my point clear :-)

Yes but I don't think the proposal was to ban use of 303 but to add an
alternative solution, a third way :)

I have some sympathy with this. The situation I've faced several times
of late is roughly this:

Reasonable and technically skilled person new to linked data reviews the
field with the intention of trying it out and concludes:

(a) Separating URIs for Things[0] and URIs for Documents containing
assertions (data, descriptions, attribute values, whatever) about those
things makes sense [1].

(b) I want my Thing URIs to resolve but I don't want to use # URIs for
reasons foo/bar/baz [2].

(c) The TAG finding [3] means that we cannot use slash URIs for Things
unless we include a 303 redirect.

(d) Ergo we must use 303.

(e) Whoops this use of 303 is proving to be a barrier to adoption for my
users, maybe I'll switch to an easier technology [4].
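
For concreteness, the extra round trip behind (c) and (d) looks roughly
like this (paths purely illustrative):

   GET /id/thing HTTP/1.1
   => 303 See Other, Location: http://example.org/doc/thing

   GET /doc/thing HTTP/1.1
   => 200 OK (an RDF document describing /id/thing)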

Clearly simply using # URIs solves this but people can be surprisingly
reluctant to go that route.

I take this discussion to be exploring the question:

Would a third alternative be possible?  People can continue to
use # URIs and to use slash URIs with 303s but would it be that
bad if we allowed people to use slash URIs for Things, without
the redirect?

The talk of "dropping" and "deprecating" I've heard has been concerned
with the TAG finding on http-range-14 (which does ban use of slash URIs
for Things and thus is a genuine, standards-body-backed, objection to
such a third way) rather than to the use of 303s by those happy to do
so.

Hope this helps rather than muddies things further.

Cheers,
Dave

[0] I'm going to try to use the terminology of Thing and Document here
rather than NIR and IR - inspired by Tim's historical note (thanks to
Andy Seaborne for pointing this out):
http://lists.w3.org/Archives/Public/www-tag/2009Aug/.html

[1] Note that some people conclude something more like "this is a
philosophical distinction that I don't care about, I'll go hang with a
different crowd". This is not the branch we're concerned with here.

[2] See for example the reasons cited in Tim's historical summary note.

[3] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039

[4] Note that I'm in no way suggesting that 303 redirects are the only or
the biggest barrier to adoption. They just have a way of triggering
conversations with users and early adopters that tend to be
counterproductive.




Re: Is 303 really necessary?

2010-11-05 Thread Dave Reynolds
On Fri, 2010-11-05 at 12:11 +, Norman Gray wrote: 
 Greetings,
 
 On 2010 Nov 4, at 13:22, Ian Davis wrote:
 
  http://iand.posterous.com/is-303-really-necessary
 
 I haven't been aware of the following formulation of Ian's problem+solution 
 in the thread so far.  Apologies if I've missed it, or if (as I guess) it's 
 deducible from someone's longer post.
 
 
 httpRange-14 requires that a URI with a 200 response MUST be an IR; a URI 
 with a 303 MAY be a NIR.
 
 Ian is (effectively) suggesting that a URI with a 200 response MAY be an IR, 
 in the sense that it is defeasibly taken to be an IR, unless this is 
 contradicted by a self-referring statement within the RDF obtained from the 
 URI.
 
 
 Is that about right?  That fits in with Harry's remarks about IRW, and the 
 general suspicion of deriving important semantics from the details of the 
 HTTP transaction.  Here, the only semantics derivable from the transaction is 
 defeasible.  In the absence of RDF, this is equivalent to the httpRange-14 
 finding, so might require only adjustment, rather than replacement, of 
 httpRange-14.

Very nice. That seems like an accurate and very helpful way of looking
at Ian's proposal.

Dave





RE: Domain of Dublin Core terms

2010-10-13 Thread Dave Reynolds
Thanks Andy, that's very helpful.

I had looked at the Abstract Model but the "i.e. an rdfs:Resource"
interpretation wasn't obvious to me from that, which is why I turned to
the FAQ.

Dave

On Tue, 2010-10-12 at 10:16 +0100, Andy Powell wrote: 
 Well... the DCMI Abstract Model [1] says that a 'described resource' is a 
 'resource' (i.e. an rdfs:Resource):
 
 resource (http://www.w3.org/2000/01/rdf-schema#Resource)
 Anything that might be identified. Familiar examples include an electronic 
 document, an image, a service (for example, today's weather report for Los 
 Angeles), and a collection of other resources. Not all resources are network 
 retrievable; for example, human beings, corporations, concepts and bound 
 books in a library can also be considered resources.
 
 so in the context of current DCMI thinking your interpretation is too narrow. 
 (I haven't looked at the FAQ but I'm not sure how well maintained it is, nor 
 whether it has been updated in line with the language used in the Abstract 
 Model).
 
 Historically, DCMI used to talk about document-like objects (DLOs) as being 
 the kind of things that DC metadata was optimised to describe. Some of this 
 legacy remains in, say, the definition of dcterms:format (which is pretty 
 horrible in any case) [2] and the DC Type vocabulary [3] and in more general 
 attitudes and practice.
 
 To make matters worse, I think there are probably a wide range of views about 
 what DC metadata can reasonably be used to describe within the DCMI community 
 - and indeed on the value of things like Linked Data :-). I tried to touch on 
 some of this in my recent talk at the ISKO "Linked Data - the future of 
 knowledge organisation on the Web" conference a few weeks ago [4]. One of 
 DCMI's problems is that its longevity means that there are a wide range of 
 attitudes and practices to accommodate.
 
 Overall though, I suggest that there is a general trend towards the 
 acceptance of using an appropriate mix of DC terms to describe any kind of 
 resource. If nothing else, DC terms are used to describe DC terms, which are 
 themselves conceptual :-).
 
 (Note that I was one of the authors of the Abstract Model and therefore tend 
 to use it as my reference point rather more heavily than others do. For info, 
 there is a current conversation within DCMI about the continuing need for a 
 separate DCMI Abstract Model, as opposed to simply using the RDF model.)
 
 [1] http://dublincore.org/documents/2007/06/04/abstract-model/
 [2] http://dublincore.org/documents/dcmi-terms/#terms-format
 [3] http://dublincore.org/documents/dcmi-terms/#H7
 [4] http://www.slideshare.net/andypowe11/linked-data-the-long-and-winding-road
 
 Andy
 
 --
 Andy Powell
 Research Programme Director
 Eduserv
 t: 01225 474319
 m: 07989 476710
 twitter: @andypowe11
 blog: efoundations.typepad.com
 
 www.eduserv.org.uk 
 
 -Original Message-
 From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf 
 Of Dave Reynolds
 Sent: 11 October 2010 22:54
 To: Linking Open Data
 Subject: Domain of Dublin Core terms
 
 This is a back to basics kind of question ...
 
 What sorts of entities are we happy to describe using Dublin Core Terms?
 
 The Dublin Core Abstract Model [1] talks about "described resources"
 which are described in the FAQ [2] as "anything addressable via a
 URL ... including various collections of documents and non-electronic
 forms of media such as a museum or library archive". I've always taken
 this to mean that such resources are Information Resources in the sense
 of http-range-14, not abstract concepts. 
 
 So I've been happy using, say, dct:spatial to talk about the area
 covered by some report or some data set (c.f. its use in dcat [3]) but
 not happy to use it for, say, the area affected by some public project
 or administered by a local council.
 
 Various discussions have led me to question whether I'm being too
 restrictive here and whether the LOD general practice has evolved to use
 dcterms more broadly than that.
 
 The published schema for dcterms has no rdfs:domain declarations for the
 bulk of the properties and no class representing describable resources.
 So from a pure inference point of view using properties such as
 dct:spatial on an abstract thing like a project does no harm. 
 
 The question is whether the informal semantics or best practice
 expectations suggest avoiding this.
 
 Dave
 
 [1] http://dublincore.org/documents/2007/06/04/abstract-model/
 [2] http://dublincore.org/resources/faq/#whatisaresource
 [3]
 http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Vocabulary_Reference#Property:_spatial.2Fgeographic_coverage
 
 
 
 
 






Domain of Dublin Core terms

2010-10-11 Thread Dave Reynolds
This is a back to basics kind of question ...

What sorts of entities are we happy to describe using Dublin Core Terms?

The Dublin Core Abstract Model [1] talks about "described resources"
which are described in the FAQ [2] as "anything addressable via a
URL ... including various collections of documents and non-electronic
forms of media such as a museum or library archive". I've always taken
this to mean that such resources are Information Resources in the sense
of http-range-14, not abstract concepts. 

So I've been happy using, say, dct:spatial to talk about the area
covered by some report or some data set (c.f. its use in dcat [3]) but
not happy to use it for, say, the area affected by some public project
or administered by a local council.
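
In Turtle terms (a sketch with hypothetical eg: resources), the contrast
is between:

   @prefix dct: <http://purl.org/dc/terms/> .
   @prefix eg:  <http://example.org/> .

   # the use I've been happy with: coverage of a document
   eg:spendingReport dct:spatial eg:anytownDistrict .

   # the use I've avoided: the area affected by an abstract thing
   eg:bypassProject  dct:spatial eg:anytownDistrict .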

Various discussions have led me to question whether I'm being too
restrictive here and whether the LOD general practice has evolved to use
dcterms more broadly than that.

The published schema for dcterms has no rdfs:domain declarations for the
bulk of the properties and no class representing describable resources.
So from a pure inference point of view using properties such as
dct:spatial on an abstract thing like a project does no harm. 

The question is whether the informal semantics or best practice
expectations suggest avoiding this.

Dave

[1] http://dublincore.org/documents/2007/06/04/abstract-model/
[2] http://dublincore.org/resources/faq/#whatisaresource
[3]
http://www.w3.org/egov/wiki/Data_Catalog_Vocabulary/Vocabulary_Reference#Property:_spatial.2Fgeographic_coverage







Re: PUBLINK Linked Data Consultancy

2010-10-07 Thread Dave Reynolds
On Thu, 2010-10-07 at 01:38 +0200, Sören Auer wrote: 
 On 07.10.2010 1:13, Georgi Kobilarov wrote:
  So, now the EU also takes that burden off the small linked data
  consultancies and businesses.
 
 Not at all! PUBLINK is not aimed at organizations which already 
 precisely know what they want and are willing to pay for it.
 
 It is more aimed at people in organizations who want to persuade their 
 decision makers or decision makers who need more information or a 
 showcase in order to get ultimately involved.
 
 Insofar PUBLINK rather clears the way for commercial linked data service 
 providers.

But it is not working with any breadth of such providers.

I share Georgi's reservations, seems like an odd direction for EU
framework projects to take.

Dave





Re: Best Practices for Converting CSV into LOD?

2010-08-10 Thread Dave Reynolds
On Mon, 2010-08-09 at 10:37 -0600, Wood, Jamey wrote: 
 Are there any established best practices for converting CSV data into 
 LOD-friendly RDF?  For example, I would like to produce an LOD-friendly RDF 
 version of the 2001 - Present Net Generation by State by Type of Producer by 
 Energy Source CSV data at:
 
   http://www.eia.doe.gov/cneaf/electricity/epa/epa_sprdshts_monthly.html
 
 I'm attaching a sample of a first stab at this.  Questions I'm running into 
 include the following:
 
 
  1.  Should one try to convert primitive data types (particularly strings) 
 into URI references?  Or just leave them as primitives?  Or perhaps provide 
 both (with separate predicate names)?  For example, the  sample EIA data I 
 reference has two-letter state abbreviations in one column.  Should those be 
 left alone or converted into URIs?

If the code corresponds to a concept which has a useful URI to link to
then yes. 

In cases where the string is a code but there isn't an existing URI
scheme then one approach is to create a set of SKOS concepts to
represent the codes, recording the original code string using
skos:notation.
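
For instance (a sketch, with a hypothetical eg: namespace), a SKOS
concept for a two-letter state code might look like:

   @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
   @prefix eg:   <http://example.org/def/state/> .

   eg:CO a skos:Concept ;
       skos:prefLabel "Colorado"@en ;
       skos:notation  "CO" .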

 2.  Should one merge separate columns from the original data in order to 
 align to well-known RDF types?  For example, the sample EIA data has separate 
 Year and Month columns.  Should those be merged in the RDF version so 
 that an xs:gYearMonth type can be used?

Probably. Merging is useful if you are going to query via the merged
form. In a case like year/month there could be an argument for also
keeping the separate forms as well to enable you to query by month,
independent of year.
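
Something along these lines (hypothetical eg: terms), keeping both the
merged and the separate forms:

   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
   @prefix eg:  <http://example.org/ns#> .

   eg:obs1
       eg:refPeriod "2001-03"^^xsd:gYearMonth ;
       eg:year      "2001"^^xsd:gYear ;
       eg:month     "--03"^^xsd:gMonth .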

 3.  Should one attempt to introduce some sort of hierarchical structure (to 
 make the LOD more browseable)?  The skos:related triples in the attached 
 sample are an initial attempt to do that.  Is this a good idea?  If so, is 
 that a reasonable predicate to use?  If it is a reasonable thing to do, we 
 would presumably craft these triples so that one could navigate through the 
 entire LOD (e.g. state -> state/year -> state/year/month -> 
 state/year/month/typeOfProducer -> 
 state/year/month/typeOfProducer/energySource).

Another approach is to use one of the statistics-in-RDF representations
so that you can slice by the dimensions in the data.

There is the Scovo vocabulary [1]. 

Recently a group of us have been working on an updated vocabulary for
statistics [2] based on the SDMX standard [3]. At a recent Open Data
Foundation workshop [4] we agreed to partition the SDMX-in-RDF work into
a simple Data Cube vocabulary [5] and extension vocabularies to
support particular domains such as aggregate statistics (SDMX) and maybe
eventually micro-data (DDI).

The Data Cube vocabulary is very much a work in progress but I think we
have now closed out all the main open design questions, have a draft
vocab and aim to get the initial documentation to a usable state over
the coming few weeks.
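
Very roughly, an observation from a dataset like the EIA one might look
as follows under the draft vocabulary - a sketch only, since the qb:
terms are still in flux and the eg: dimensions are hypothetical:

   @prefix qb:  <http://purl.org/linked-data/cube#> .
   @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
   @prefix eg:  <http://example.org/ns#> .

   eg:obs1 a qb:Observation ;
       qb:dataSet       eg:netGenerationDataset ;
       eg:state         eg:CO ;
       eg:refPeriod     "2001-03"^^xsd:gYearMonth ;
       eg:energySource  eg:coal ;
       eg:generation    12345 .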

Feel free to ping me off line if you would like to follow up on this.

Dave

[1] http://semanticweb.org/wiki/Scovo
[2] http://code.google.com/p/publishing-statistical-data/
[3] http://sdmx.org/
[4] http://www.odaf.org/blog/?p=39
[5]
http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/cube.html







Re: [ANN] Uberblic Search API

2010-07-21 Thread Dave Reynolds
Hi Georgi,

Does that mean you are back on the lists? :)

Great API - congratulations!

A suggestion would be to add some form of disambiguating description to
the keyword completion. If I type Scarlet then the completion options
look something like:

   Scarlett
   Scarlett Johansson
   Scarlett Johansson
   Scarlett
   Scarlett
   Scarlett Johansson
   ...
   Scarlett Johansson

Each of the four Johansson entries seems to bring up a different display
but it's hard to work out which is the right one to use and the order is
unpredictable.

Cheers,
Dave

On Wed, 2010-07-21 at 10:48 +0200, Georgi Kobilarov wrote: 
 Hello,
 
 there's a new Uberblic Search API [1] which aims to make the life easier for
 developers who want to build tagging  search interfaces on top of the
 uberblic data repository. Or who want a Search API with just a little
 semantics for finding named entites in data sources like Wikipedia,
 Geonames, Foursquare, Musicbrainz, ...
 
 The API supports simple lookup queries, but also a bit more semantic
 queries. It's like my lookup.dbpedia.org service on steroids...
 
 
 Looking for the URI of the company Starbucks as defined in Wikipedia? 
 source:[enwikipedia] type:[uo:Company] Starbucks
 
 Or, you know, that movie with Bill Murray and Scarlett ... what's her last
 name?
 type:[uo:Film] starring:[Bill Murray] starring:[Scarlett]
 
 The API supports type-ahead / autocomplete interfaces as well.
 So if you want an autocomplete-enabled search box for Bill Murray movies,
 just prefix your query with 
 type:[uo:Film] starring:[Bill Murray] 
 
 Try it out by copying that query into the search box at
 http://platform.uberblic.org and start typing movie names... 
 
 
 Read more about the new API and let me know what you think:
 http://uberblic.org/2010/07/uberblic-search-api-just-enough-semantics/
 
 
 Best,
 Georgi
 
 
 [1] http://uberblic.org/developers/apis/search/
 
 --
 Georgi Kobilarov
 Uberblic Labs Berlin
 http://kobilarov.com
 
 







Re: FOAF DL

2010-07-16 Thread Dave Reynolds
Looks interesting.

In the description you say: "(1) foaf:mbox_sha1sum, foaf:jabberID,
foaf:aimChatID, foaf:icqChatID, foaf:yahooChatID and foaf:msnChatID are
not owl:InverseFunctionalProperties anymore; instead, they are defined
as owl:Keys for foaf:Agents, which is practically the same".

I agree that making them owl:Keys is the only option for DL but the
comment "practically the same" is maybe overstating it.

My understanding was that the semantics of Keys [1] only applies to
named individuals and so isn't effective on anonymous individuals (which
is a common use case in FOAF). Is that correct?

Dave

[1] http://www.w3.org/TR/owl2-direct-semantics/#Keys

On Fri, 2010-07-16 at 12:16 +0100, Antoine Zimmermann wrote: 
 Dear all,
 
 
 I know that the compatibility of FOAF with OWL DL has been discussed a 
 lot in the past (and still sometimes surfaces again).  However, I'm 
 wondering, would it be reasonable to provide a DL version of FOAF as a 
 complement to the official FOAF ontology?
 More generally, wouldn't it be reasonable to provide alternative 
 versions of an ontology?  Think of XHTML: there are three different XML 
 Schemas for XHTML [1].  One could imagine alternative versions like FOAF 
 (Full), FOAF-DL, FOAF-lite...
 
 Anyway, I did it: I've made a FOAF-DL ontology which modifies the FOAF 
 ontology such that (1) it is in OWL 2 DL and (2) it maximally preserves 
 inferences of the original FOAF ontology [2].
 
 Interestingly, FOAF-DL is an OWL 2 RL ontology (in a nutshell, OWL 2 RL 
 is a subset of OWL 2 DL with low computational complexity and that is 
 compatible with rule-based inference engines).
 
 You may notice that there are strange annotation properties for this 
 ontology:
 
 <owl:Ontology rdf:about="http://purl.org/az/foaf#">
    ...
    <yoda:preferredVersion rdf:resource="http://xmlns.com/foaf/0.1/"/>
    ...
 </owl:Ontology>
 
 The Yoda vocabulary [3] is used to relate alternative versions of an 
 ontology. Here, it is said that there is a preferred version, which is 
 the official FOAF ontology.
 
 Critiques to any of the previous comments are welcome.
 
 
 [1] http://www.w3.org/TR/xhtml1-schema/#schemas
 [2] The FOAF-DL ontology. http://purl.org/az/foaf
 [3] Yoda: A Vocabulary for Linking Alternative Specifications of a 
 Vocabulary. http://purl.org/NET/yoda
 
 
 Regards,





Re: FOAF DL

2010-07-16 Thread Dave Reynolds
On Fri, 2010-07-16 at 16:17 +0100, Antoine Zimmermann wrote: 
 Beware, technical stuff follows.
 
 
 
 Le 16/07/2010 13:07, Dave Reynolds a écrit :
  Looks interesting.
 
  In the description you say (1) foaf:mbox_sha1sum, foaf:jabberID,
  foaf:aimChatID, foaf:icqChatID, foaf:yahooChatID and foaf:msnChatID are
  not owl:InverseFunctionalProperties anymore; instead, they are defined
  as owl:Keys for foaf:Agents, which is practically the same
 
  I agree that making them owl:Keys is the only option for DL but the
  comment practically the same is maybe overstating it.
 
  My understanding was that the semantics of Keys [1] only applies to
  named individuals and so isn't effective on anonymous individuals (which
  is a common use case in FOAF). Is that correct?
 
 Yes, you are correct.
 
 Here, I made a simplifying shortcut by saying "practically the same". 
 In most cases, a blank node can simply be considered as a named 
 individual, using the blank node ID (regardless of whether it is 
 specified in the serialisation or internally represented) as a name 
 for the individual.
 In this case, the un-named individuals are only those that exist because 
 of inferences but which have no identifier at all (for instance, when 
 using the OWL construct owl:someValuesFrom).
 
 Such shortcuts are practically used in reasoners to deduce things about 
 blank nodes in the same way they deduce things about URIs.
 
 
 Using this approach on the following data:
 
 
 foaf:yahooChatID a owl:DatatypeProperty ;
   rdfs:domain foaf:Agent .
 foaf:Agent owl:hasKey ( foaf:yahooChatID ) .
 _:bnode1 foaf:yahooChatID "xyz" .
 _:bnode2 foaf:yahooChatID "xyz" .
 
 
 one can infer:
 
 
 _:bnode1 owl:sameAs _:bnode2 .
 
 
 When the serialisation does not specify a name for a blank node, there 
 is still an internal name somewhere.  For instance:
 
 
 [ a :Person ] foaf:yahooChatID "xyz" ;
 foaf:firstName "John" .
 [ a :Person ] foaf:yahooChatID "xyz" ;
 foaf:lastName "Doe" .
 
 allows one to infer that there is one person with first name "John" and 
 last name "Doe" (but its local identifier is only known to the reasoner).
 
 An example where the inference would not hold is as follows:
 
 :chatID a owl:DatatypeProperty ;
  rdfs:domain :Agent .
 :Agent owl:hasKey ( :chatID ) .
 :hasFather a owl:ObjectProperty, owl:FunctionalProperty .
 :john :chatID "xyz" .
 :bob a [ a owl:Restriction ;
   owl:onProperty :hasFather ;
   owl:allValuesFrom [ a owl:Restriction ;
   owl:onProperty :chatID ;
   owl:hasValue "xyz" ]
 ] .
 
 from this, we cannot conclude that John is the father of Bob, although 
 Bob has a father (un-named) whose :chatID is exactly the same as John's.
 If :chatID were inverse functional, we could conclude it. However, I said 
 "practically the same" because this case is unlikely to occur in practice.
 
 
 Hope it's clear enough. For the specs for keys, see [1,2,3].

Yes that's clear and reasonable. 

I had looked at those parts of the specs but the phrasing in [1] seemed
to rule out blank nodes since the RDF semantics [3] specifically states
that blank nodes are existentially quantified variables:

"are treated as simply indicating the existence of a thing, without
using, or saying anything about, the name of that thing. (This is not
the same as assuming that the blank node indicates an 'unknown' URI
reference; for example, it does not assume that there is any URI
reference which refers to the thing.)"

However, you are basically saying that the Skolemization lemma applies
here, which makes sense.

Thanks,
Dave

[3] http://www.w3.org/TR/rdf-mt/#unlabel


 [1] OWL 2 Web Ontology Language - Structural Specification and 
 Functional-Style Syntax, Sect.9.5 Keys. 
 http://www.w3.org/TR/owl2-syntax/#Keys
 [2] OWL 2 Web Ontology Language - New Features and Rationale, Sect.2.2.6 
 F9: Keys. http://www.w3.org/TR/owl2-new-features/#F9:_Keys
 [3] OWL 2 Web Ontology Language - Direct Semantics, Sect.2.3.5 Keys. 
 http://www.w3.org/TR/owl2-semantics/#Keys
 
 
 Regards,
 
  Dave
 
  [1] http://www.w3.org/TR/owl2-direct-semantics/#Keys
 
  On Fri, 2010-07-16 at 12:16 +0100, Antoine Zimmermann wrote:
  Dear all,
 
 
  I know that the compatibility of FOAF with OWL DL has been discussed a
  lot in the past (and still sometimes surfaces again).  However, I'm
  wondering, would it be reasonable to provide a DL version of FOAF in
  complement of the official FOAF ontology?
  More generally, wouldn't it be reasonable to provide alternative
  versions of an ontology?  Think of XHTML: there are three different XML
  Schemas for XHTML [1].  One could imagine alternative versions like FOAF
  (Full), FOAF-DL, FOAF-lite...
 
  Anyway, I did it: I've made a FOAF-DL ontology which modifies the FOAF
  ontology such that (1) it is in OWL 2 DL and (2) it maximally preserves
  inferences of the original FOAF ontology [2].
 
  Interestingly, FOAF-DL is an OWL 2 RL ontology

Re: Show me the money - (was Subjects as Literals)

2010-07-11 Thread Dave Reynolds
On Thu, 2010-07-01 at 22:44 -0500, Pat Hayes wrote: 
 Jeremy, your argument is perfectly sound from your company's POV, but  
 not from a broader perspective. Of course, any change will incur costs  
 by those who have based their assumptions upon no change happening.  
 Your company took a risk, apparently. IMO it was a bad risk, as you  
 could have implemented a better inference engine if you had allowed  
 literal subjects internally in the first place, but whatever. 

I've tried to be quiet but I couldn't let this dig slide by ... 

Jena, which Jeremy's software is based on, *does* allow literals as
subjects internally (the Graph SPI) and the rule reasoners *do* work
with generalized triples just as most such RDF reasoners do. However, we
go to some lengths to stop the generalized triples escaping. So the lack
of literals as subjects in the exchange syntax or the publicly
standardized model has had no detrimental impact on our ability to work
with them internally.
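
As an illustration (from memory, so treat this as a sketch rather than
exact current Jena rule syntax), a rule along the lines of:

   [flip: (?s eg:value ?v) -> (?v eg:valueOf ?s)]

can, when ?v binds a literal, derive a generalized triple with a literal
subject inside the reasoner; the API layer then keeps such triples from
leaking out into a standard Model.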

Dave

[Apologies if this point has already been made down thread, I'm only 303
messages in and have 242 left to scan :)]






Re: Organization ontology

2010-06-08 Thread Dave Reynolds
On Mon, 2010-06-07 at 22:27 +0100, William Waites wrote: 
 On 10-06-03 16:04, Dave Reynolds wrote:
  It would be great if you could suggest a better phrasing of the
  description of a FormalOrganization that would better encompass the
  range of entities you think should go there? Or are you advocating that
  the distinction between a generic organization and a externally
  recognized semi-autonomous organization is not a useful one?

 
 Reading the rest of your mail, I think the latter. Do we really need
 FormalOrganisation at all? Can we not just have Organisation and
 then some extension vocabulary could have subclasses for different
 flavours of partnerships, corporations, unincorporated associations
 etc. as needed?

Indeed, as it says in the documentation, almost all Organization
categorization is left to extension vocabularies and we deliberately
avoided including distinctions such as partnerships, corporations etc
since they are so jurisdiction-specific.

The only categorization we included is this separation between
externally recognized entities and internal units - extensions and
applications are free to by-pass that and directly exploit
org:Organization. 

 I don't think the distinction is useless as such, perhaps that it is
 underspecified and "Formal" is ambiguous.

I agree there's an element of underspecification in there. However,
sufficiently many of the existing vocabularies that we surveyed have a
similar separation that it seemed valuable to include it, if only to
help with mapping.

Over time, if people apply org but find this distinction unhelpful or
confusing we could deprecate it. The aim here was to get something
workable (not necessarily perfect) done quickly and make it available.
If org proves useful then it can improved in response to application
experience.

Cheers,
Dave






Re: Organization ontology

2010-06-08 Thread Dave Reynolds
On Tue, 2010-06-08 at 01:03 +0300, Emmanouil Batsis (Manos) wrote:

 Sorry for jumping in. I was thinking that
 
 a) the way I get FormalOrganization, it could as well be called 
 LegalEntity to be more precise.

Not quite, there are other LegalEntities that are not Organizations.

The LegalEntity notion could be made explicit:

 org:FormalOrganization 
 subClassOf org:Organization AND ns:LegalEntity

This is better modelling because the primitive concepts are now explicit
and the nature of org:FormalOrganization as a derived concept is
clear.  
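
As an illustrative sketch in Turtle (with ns:LegalEntity standing in for
a suitable term from some external vocabulary):

  org:FormalOrganization a owl:Class ;
      rdfs:subClassOf org:Organization , ns:LegalEntity .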

I nearly did it that way but my concern was that putting LegalEntity
into org: would open up a whole can of worms about needing richer
modelling of the notion of LegalEntity (e.g. Jurisdiction etc). That
would be off topic for the focused goals and requirements for org.

 b) what happens when organizations change legal status?

Pretty much any aspect of organizations changes over time :) In the
context of this work there are already separate approaches to handling
versioning and change so org: defers to those. Though, in some
applications you do want to explicitly represent the historical trace of
those changes, hence the inclusion of OPMV via org:ChangeEvent to give a
minimal foundation for that.
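
As a rough sketch of that minimal foundation (the eg: resources are
hypothetical; org:resultedFrom is discussed above and org:changedBy is
its counterpart for the original organization):

  eg:merger a org:ChangeEvent ;
      dct:description "Merger of two earlier departments"@en .

  eg:oldDept org:changedBy eg:merger .
  eg:newDept org:resultedFrom eg:merger .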

Cheers,
Dave





Re: Organization ontology

2010-06-07 Thread Dave Reynolds
On Mon, 2010-06-07 at 09:34 +0100, Ian Davis wrote: 
 On Tue, Jun 1, 2010 at 8:50 AM, Dave Reynolds
 dave.e.reyno...@googlemail.com wrote:
  We would like to announce the availability of an ontology for description of
  organizational structures including government organizations.
 
 
 Congratulations on the publication of this ontology! I've added it to
 Schemapedia here:
 
 http://schemapedia.com/schemas/org

Thanks Ian. 

 I noticed a small semantic typo in the example at the end of section
 3. skos:preferredLabel should be skos:prefLabel

Fixed.

Cheers,
Dave






Re: Organization ontology

2010-06-06 Thread Dave Reynolds
Thanks to everyone for the good feedback and comments.

I've made some small changes to the ontology based on all the feedback.
These are largely small bug fixes and (hopefully) improvements in
documentation. 

The significant changes include: 
  * addition of a transitive version of org:subOrganizationOf 
  * improved mapping to foaf 
  * reversed direction of org:resultingOrganization (to be
org:resultedFrom) to correct compatibility with OPMV 

The full set of changes are listed in the updated documentation [1].

Dave

[1] http://www.epimorphics.com/public/vocabulary/org.html#changes

On Tue, 2010-06-01 at 08:50 +0100, Dave Reynolds wrote: 
 We would like to announce the availability of an ontology for 
 description of organizational structures including government organizations.
 
 This was motivated by the needs of the data.gov.uk project. After some 
 checking we were unable to find an existing ontology that precisely met 
 our needs and so developed this generic core, intended to be extensible 
 to particular domains of use.
 
 The ontology is documented at [1] and some discussion on the 
 requirements and design process are at [2].
 
 W3C have been kind enough to offer to host the ontology within the W3C 
 namespace [3]. This does not imply that W3C endorses the ontology, nor 
 that it is part of any standards process at this stage. They are simply 
 providing a stable place for posterity.
 
 Any changes to the ontology involving removal of, or modification to, 
 existing terms (but not necessarily addition of new terms) will be 
 announced to these lists. We suggest that any discussion take place on 
 the public-lod list to avoid further cross-posting.
 
 Dave, Jeni, John
 
 [1] http://www.epimorphics.com/public/vocabulary/org.html
 [2] 
 http://www.epimorphics.com/web/category/category/developers/organization-ontology
 [3] http://www.w3.org/ns/org# (available in RDF/XML, N3, Turtle via 
 conneg or append .rdf/.n3/.ttl)





Re: Organization ontology

2010-06-03 Thread Dave Reynolds
On Thu, 2010-06-03 at 09:29 -0400, Bob DuCharme wrote:
 Is any sample instance data available, whether it's using real or fake 
 organizations?

Not yet, but there will be. 

Dave





Re: Organization ontology

2010-06-03 Thread Dave Reynolds
On Thu, 2010-06-03 at 14:07 +0100, William Waites wrote:
 On 10-06-03 09:01, Dan Brickley wrote:
  I don't find anything particularly troublesome about the org: vocab on
  this front. If you really want to critique culturally-loaded
  ontologies, I'd go find one that declares class hierarchies with terms
  like 'Terrorist' without giving any operational definitions...

 
 I must admit when I looked at the org vocabulary I had a feeling
 that there were some assumptions buried in it but discarded a
 couple of draft emails trying to articulate it.
 
 I think it stems from org:FormalOrganization being a thing that is
 legally recognized and org:OrganizationalUnit (btw, any
 particular reason for using the North American spelling here?)
 being an entity that is not recognised outside of the FormalOrg

org:Organization is useful directly; the two subclasses do not form a
covering - they do not exhaust the space. They are just useful
distinctions in a broad variety of applications - as indicated by their
presence in a number of the ontologies we surveyed [2]. 
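
So, for example, an informal group (hypothetical eg: name) can simply be
typed with the generic class, with neither subclass implied:

  eg:readingGroup a org:Organization .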

On spelling, to quote from the public design notes [1]:

Let's get this one out of the way - are we organized or organised?
American English demands -ize but both are correct in British English;
-ize is preferred by the OED (the Oxford spelling); -ise is preferred
by Fowler, The Times and is 50% more common in the British National
Corpus. If we want to strive for broad uptake then picking one which is
acceptable for all versions of English is the obvious choice so we'll go
for -ize. After all, being on the same side as the OED can't be all
bad.

 Organisations can become recognised in some circumstances
 despite never having solicited outside recognition from a state --
 this might happen in a court proceeding after some collective
 wrongdoing. Conversely you might have something that can
 behave like a kind of organisation, e.g. a class in a class-action
 lawsuit without the internal structure present in most organisations.

The ontology doesn't talk about having solicited recognition so I
don't think that distinction is relevant here.

It is up to you, in applying this simple core ontology, whether the
distinction between general org:Organization and org:FormalOrganization
is useful to your application. The nature of the formality is left
fairly open but if it is too constraining then model at org:Organization
level.

 Is a state an Organisation?

Yes, whether it is one that you would usefully model using this is a
different question.

 Organisational units can often be semi-autonomous (e.g. legally
 recognised) subsidiaries of a parent or holding company. What
 about quangos or crown-corporations (e.g. corporations owned
 by the state). They have legal recognition but are really like
 subsidiaries or units.

Certainly, there is no requirement that FormalOrganizations can't have
other FormalOrganizations as subOrganizations. The containment hierarchy
is very open specifically to allow just that sort of structure.
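
For instance, a legally recognized subsidiary within a holding company
might be modelled (hypothetical eg: names) as:

  eg:holdingCo  a org:FormalOrganization .
  eg:subsidiary a org:FormalOrganization ;
      org:subOrganizationOf eg:holdingCo .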

 Some types of legally recognised organisations don't have a
 distinct legal personality, e.g. a partnership or unincorporated
 association so they cannot be said to have rights and responsibilities,
 rather the members have joint (or joint and several) rights and
 responsibilities. This may seem like splitting hairs but from a
 legal perspective it's an important distinction, at least in some
 legal environments. The description provided in the vocabulary
 is really only true for corporations or limited companies.

[Aside: I believe that in the UK Partnerships do have some legal
recognition, just as Sole Traders do. Partners also have joint and
several responsibilities but the Partnership itself is a recognized
entity for some purposes. ]

It would be great if you could suggest a better phrasing of the
description of a FormalOrganization that would better encompass the
range of entities you think should go there? Or are you advocating that
the distinction between a generic organization and an externally
recognized semi-autonomous organization is not a useful one?

 I think the example, eg:contract1 is misleading since this is
 an inappropriate way to model a contract. A contract has two
 or more parties. A contract might include a duty to fill a role
 on the part of one party but it is not normally something that
 has to do with membership

You are reading way too much into the choice of spelling of a URI! The
example is simply to illustrate how the vocabulary should be used to
bind a person to an organization in some form of role. I could have used
a bNode there. There is nothing in there to model Contracts with a big-C
- that would be a whole other ball game! I'll change the name to avoid
such confusion.
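
For illustration, the binding with a bNode for the n-ary relation would
look like this (hypothetical eg: names):

  eg:alice org:memberOf eg:acme .

  [] a org:Membership ;
      org:member eg:alice ;
      org:organization eg:acme ;
      org:role eg:ceo .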

 Membership usually has a particular meaning as applied to
 cooperatives and not-for-profits. They usually wring their hands
 extensively about what exactly membership means. This concept
 normally doesn't apply to other types of 

Re: Organization ontology

2010-06-03 Thread Dave Reynolds
On Thu, 2010-06-03 at 12:41 -0400, Bob DuCharme wrote:
 Dave,
 
 Does this mean that no sample data has been created yet, or that
 samples used in the course of development are not data that you are
 free to share? 

Given the rather ... short ... timescale we were working under, the
sketchy examples used in the course of development are not in a fit
state to publish as examples of how to do things.

There are several strands of work going on applying and specializing the
ontology to real data and that will, I hope, result in publishable
examples soon.

Possibly, given that this work seems to have struck a chord with people,
it might be worth generating a worked example sooner, one that isn't
encumbered by the quality and completeness requirements that the real
data has. Will think about that.

Cheers,
Dave





Re: Organization ontology

2010-06-02 Thread Dave Reynolds
On Wed, 2010-06-02 at 17:06 +1200, Stuart A. Yeates wrote:
 On Tue, Jun 1, 2010 at 7:50 PM, Dave Reynolds
 dave.e.reyno...@googlemail.com wrote:
  We would like to announce the availability of an ontology for description of
  organizational structures including government organizations.
 
  This was motivated by the needs of the data.gov.uk project. After some
  checking we were unable to find an existing ontology that precisely met our
  needs and so developed this generic core, intended to be extensible to
  particular domains of use.
 
  [1] http://www.epimorphics.com/public/vocabulary/org.html
 
 I think this is great, but I'm a little worried that a number of
 Western (and specifically Westminster) assumptions may have been
 built into it.

Interesting. We tried to keep the ontology reasonably neutral, that's
why, for example, there is no notion of a Government or Corporation.

Could you say a little more about the specific Western & Westminster
assumptions that you feel are built into it?

We do have the notion of a Head role and corresponding headOf
relation (because it is such a common notion and part of our competency
questions) but there are no cardinality constraints and no requirement
that any specific organizational structure support that role.
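
So, for example, recording no head, one head or several heads is equally
legal (hypothetical eg: names):

  eg:alice org:headOf eg:ministry .
  eg:bob   org:headOf eg:ministry .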

 What would be great would be to see a handful of different
 organisations (or portions of them) from different traditions
 modelled. Maybe:
 * The tripartite system at the top of US government, which seems
 pretty complex to me, with former Presidents apparently retaining some
 control after they leave office

Control is a different issue from organizational structure. This
ontology is not designed to support reasoning about authority and
governance models. There are Enterprise Ontologies that explicitly model
authority, accountability and empowerment flows and it would be possible
to create a generic one which bolted alongside org but org is not such a
beast :)

Dave





Organization ontology

2010-06-01 Thread Dave Reynolds
We would like to announce the availability of an ontology for 
description of organizational structures including government organizations.


This was motivated by the needs of the data.gov.uk project. After some 
checking we were unable to find an existing ontology that precisely met 
our needs and so developed this generic core, intended to be extensible 
to particular domains of use.


The ontology is documented at [1] and some discussion on the 
requirements and design process are at [2].


W3C have been kind enough to offer to host the ontology within the W3C 
namespace [3]. This does not imply that W3C endorses the ontology, nor 
that it is part of any standards process at this stage. They are simply 
providing a stable place for posterity.


Any changes to the ontology involving removal of, or modification to, 
existing terms (but not necessarily addition of new terms) will be 
announced to these lists. We suggest that any discussion take place on 
the public-lod list to avoid further cross-posting.


Dave, Jeni, John

[1] http://www.epimorphics.com/public/vocabulary/org.html
[2] 
http://www.epimorphics.com/web/category/category/developers/organization-ontology
[3] http://www.w3.org/ns/org# (available in RDF/XML, N3, Turtle via 
conneg or append .rdf/.n3/.ttl)




Re: Organization ontology

2010-06-01 Thread Dave Reynolds
On Tue, 2010-06-01 at 09:26 +0100, Michael Hausenblas wrote:
 Dave,
 
  We would like to announce the availability of an ontology for
  description of organizational structures including government organizations.
 
 Brilliant! I submitted it now to Sindice [1] and 'registered' the org prefix
 in prefix.cc [2]

Thanks Michael.

  - you might want to support it by voting it up ;)

Done :)

Dave





Re: Organization ontology

2010-06-01 Thread Dave Reynolds
On Tue, 2010-06-01 at 11:04 +0200, Christophe Guéret wrote:
 On 06/01/2010 10:26 AM, Michael Hausenblas wrote:
  Dave,
 
 
  We would like to announce the availability of an ontology for
  description of organizational structures including government 
  organizations.
   
  Brilliant! I submitted it now to Sindice [1] and 'registered' the org prefix
  in prefix.cc [2] - you might want to support it by voting it up ;)
 
  Cheers,
 Michael
 
  [1]
  http://sindice.com/search?q=domain%3Awww.w3.org+Core+organization+ontology&qt=term
  [2] http://prefix.cc/org
 
 
 Nice. I've added it to CKAN: http://www.ckan.net/package/org_ontology

Thanks.

Great to see how easy it is to get such a vocabulary registered these
days. Just mention it here and people leap to help you make it more
widely discoverable!

Dave





Re: Organization ontology

2010-06-01 Thread Dave Reynolds
On Tue, 2010-06-01 at 10:37 +0100, Damian Steer wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On 01/06/10 08:50, Dave Reynolds wrote:
  We would like to announce the availability of an ontology for
  description of organizational structures including government
  organizations.
 
 Looks good Dave.
 
 This is fairly close to AIISO [1], which I'm using for our university
 structure. I'll ping them to suggest adding subproperty mappings.

Ah. I missed that one, despite having tried all the ontology search
tools I could think of. Thanks for pointing it out.

  Any changes to the ontology involving removal of, or modification to,
  existing terms (but not necessarily addition of new terms) will be
  announced to these lists. We suggest that any discussion take place on
  the public-lod list to avoid further cross-posting.
 
 Suggestion: skos provides property and propertyTransitive [2] as a
 transitive variant. I find this pattern useful for expressing the ground
 facts (dept unitOf faculty) and woolier inferences for navigation (dept
 unitOfTransitive univ).

Yes, that's a good suggestion. I've put that on the list to add.
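
The pattern would presumably look something like this (the exact
property name was still to be settled at this point):

  org:transitiveSubOrganizationOf
      a owl:ObjectProperty , owl:TransitiveProperty .
  org:subOrganizationOf
      rdfs:subPropertyOf org:transitiveSubOrganizationOf .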

Cheers,
Dave





Re: Organization ontology

2010-06-01 Thread Dave Reynolds
Hi Bernard,

On Tue, 2010-06-01 at 17:03 +0200, Bernard Vatant wrote:
 Hi Dave
 
 Great resource indeed. One remark, one suggestion, and one question :)
 
 Remark : Just found out what seems to be a mistake in the N3 file.
 
 org:role a owl:ObjectProperty, rdf:Property;
     rdfs:label "role"@en;
     rdfs:domain org:Membership;
     rdfs:range  foaf:Agent;
 ...
 
 I guess one should read: rdfs:range org:Role

Oops, thanks, will get that fixed shortly (hopefully tonight or
tomorrow).
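
For reference, the corrected declaration would then read:

  org:role a owl:ObjectProperty, rdf:Property;
      rdfs:label "role"@en;
      rdfs:domain org:Membership;
      rdfs:range  org:Role .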

 Suggestion : I always feel uneasy with having a class and a property
 distinguished only by upper/lower case. Suggest changing the property to
 org:hasRole

Names are always hard! 

Some people have commented that I should just use nouns (e.g. see
comments on [1]). My rationale has been that some relations (e.g.
unitOf, subOrganizationOf) really need to have a direction indicated and
so use phrases for those. Then for things that are clearly attributes
use simple nouns. Other cases are grey. I've thought of the properties
of org:Membership as being attributes of an n-ary relation and so gone
for nouns there. This helps to avoid confusion with the direct relations
- if I used org:hasRole then I ought to use org:hasMember which would
clash with the short cut use of org:memberOf.

 Question : Will the RDF/XML file be available at some point?

It is. Use content negotiation:

  curl -H "Accept: application/rdf+xml" http://www.w3.org/ns/org#

or point your browser at http://www.w3.org/ns/org.rdf

Cheers,
Dave


[1]
http://www.epimorphics.com/web/wiki/organization-ontology-second-draft#comment-60






Re: Java Framework for Content Negotiation

2010-05-20 Thread Dave Reynolds

On 20/05/2010 11:03, Story Henry wrote:

There is the RESTlet framework http://www.restlet.org/


There's also Jersey [1] and, for a minimalist solution to just the 
content matching piece see Mimeparse [2].


Dave

[1] https://jersey.dev.java.net/
[2] http://code.google.com/p/mimeparse/


On 20 May 2010, at 10:49, Angelo Veltens wrote:


Hello,

I am just looking for a framework to do content negotiation in java. Currently I am 
checking the HttpServletRequest myself quick & dirty. Perhaps someone can 
recommend a framework/library that has solved this already.

Thanks in advance,
Angelo









Re: Announce: Linked Data Patterns book

2010-04-06 Thread Dave Reynolds

Hi Leigh,

On 06/04/2010 16:10, Leigh Dodds wrote:

Hi folks,

Ian Davis and I have been working on a catalogue of Linked Data
patterns which we've put on-line as a free book. The work is licensed
under a Creative Commons attribution license.

This is still a very early draft but already contains 30 patterns
covering identifiers, modelling, publishing and consuming Linked Data.

http://patterns.dataincubator.org

More background at [1]. We'd be interested to hear your comments, and
hope that it can become a useful resource for the growing community of
practitioners.


Looks like a great start, well done!

It would be nice to see some reference to, and inclusion of, the 
patterns that came out of SWBP. E.g. you currently seem to have one 
variant of the three n-ary relation patterns and I couldn't spot the 
classes-as-subjects pattern (but maybe didn't look hard enough).


Might also want to reference the ontology design patterns portal [1] 
which has some patterns relevant to the modelling section.


Any plans to make this a wiki so other people can contribute or at least 
comment?


Dave

[1] http://ontologydesignpatterns.org/wiki/Main_Page



Re: AW: Linked Data API

2010-02-25 Thread Dave Reynolds

Hi Chris,

Yes, I think there is some relationship to Pubby, but some differences too.

As I understand it Pubby is mostly about serving up the individual 
resources REST style. The API-in-search-of-a-cool-name does support that 
but a key part of it is about serving up paged, filtered lists of 
resources.


This simplified query mechanism, along with the JSON and XML formats, 
gives us easy programmatic access to the data sets for non-SPARQL, 
non-RDF developers.


What would you think about extending Pubby to support the proposed JSON 
and simple XML formats? Could be cool.


Dave



On 25/02/2010 17:08, Chris Bizer wrote:

Hi Leigh,

sounds like an interesting alternative to Pubby
(http://www4.wiwiss.fu-berlin.de/pubby/) if I understand it right.

Cheers,

Chris



-----Original Message-----
From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On
Behalf Of Leigh Dodds
Sent: Thursday, 25 February 2010 12:47
To: Linking Open Data
Cc: Jeni Tennison; Dave Reynolds
Subject: Linked Data API

Hi all,

Yesterday, at the 2nd Linked Data London Meetup, Dave Reynolds, Jeni
Tennison and myself ran a workshop introducing some work we've been
doing around a Linked Data API.

The API is intended to be a middle-ware layer that can be deployed
in-front of a SPARQL endpoint, providing the ability to create a
RESTful data access layer for accessing the RDF data contained in the
triple store. The middle-ware is configurable, and is intended to
support a range of different access patterns and output formats. Out
of the box the system provides delivery of the standard range of RDF
serialisations, as well as simple JSON and XML serializations for
descriptions of lists of resources. The API essentially maps
parameterized URLs to underlying SPARQL queries, mediating the content
negotiation of the results into a suitable format for the client.

The current draft specification is at:

http://purl.org/linked-data/api/spec

And there is now a mailing list available at:

http://code.google.com/p/linked-data-api/

We'd be interested in your feedback. Please join the list if you're
interested in discussing the API specification or collaborating on the
work and implementations.

Cheers,

L.

--
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.do...@talis.com
http://www.talis.com







Re: Linked Data API

2010-02-25 Thread Dave Reynolds

On 25/02/2010 18:11, Nathan wrote:

Leigh Dodds wrote:

Hi all,

Yesterday, at the 2nd Linked Data London Meetup, Dave Reynolds, Jeni
Tennison and myself ran a workshop introducing some work we've been
doing around a Linked Data API.

The API is intended to be a middle-ware layer that can be deployed
in-front of a SPARQL endpoint, providing the ability to create a
RESTful data access layer for accessing the RDF data contained in the
triple store. The middle-ware is configurable, and is intended to
support a range of different access patterns and output formats. Out
of the box the system provides delivery of the standard range of RDF
serialisations, as well as simple JSON and XML serializations for
descriptions of lists of resources. The API essentially maps
parameterized URLs to underlying SPARQL queries, mediating the content
negotiation of the results into a suitable format for the client.

The current draft specification is at:

http://purl.org/linked-data/api/spec


If I may make a suggestion; I'd like you to consider including the
formed SPARQL query in with the return; so that developers can get used
 to the language and see how similar it is to existing SQL, etc.


Absolutely. The notion (and current implementation) is that the returned 
results give a reference to a metadata resource which in turn includes 
the SPARQL query and the endpoint configuration. Will check if that is 
clear in the current draft of the spec write-up.



For all this middle-ware is needed in the interim and provides access to
the masses, surely an extra chance to introduce developers to linked
data / rdf / sparql is a good thing?


Exactly. The API helps developers get started but we are trying to keep 
the essence of the RDF model intact so that they can move onto SPARQL 
and full stack as they get comfortable with it.


Dave




Re: Linked Data API

2010-02-25 Thread Dave Reynolds

Nathan wrote:

Dave Reynolds wrote:

On 25/02/2010 18:11, Nathan wrote:

Leigh Dodds wrote:

Hi all,

Yesterday, at the 2nd Linked Data London Meetup, Dave Reynolds, Jeni
Tennison and myself ran a workshop introducing some work we've been
doing around a Linked Data API.

The API is intended to be a middle-ware layer that can be deployed
in-front of a SPARQL endpoint, providing the ability to create a
RESTful data access layer for accessing the RDF data contained in the
triple store. The middle-ware is configurable, and is intended to
support a range of different access patterns and output formats. Out
of the box the system provides delivery of the standard range of RDF
serialisations, as well as simple JSON and XML serializations for
descriptions of lists of resources. The API essentially maps
parameterized URLs to underlying SPARQL queries, mediating the content
negotiation of the results into a suitable format for the client.

The current draft specification is at:

http://purl.org/linked-data/api/spec

If I may make a suggestion; I'd like you to consider including the
formed SPARQL query in with the return; so that developers can get used
to the language and see how similar it is to existing SQL, etc.

Absolutely. The notion (and current implementation) is that the returned
results give a reference to a metadata resource which in turn includes
the SPARQL query and the endpoint configuration. Will check if that is
clear in the current draft of the spec write-up.


For all this middle-ware is needed in the interim and provides access to
the masses, surely an extra chance to introduce developers to linked
data / rdf / sparql is a good thing?

Exactly. The API helps developers get started but we are trying to keep
the essence of the RDF model intact so that they can move onto SPARQL
and full stack as they get comfortable with it.


thinking out-loud here; I wonder what would happen if you created a REST
api like you have, that redirects to the SPARQL endpoint w/ query and
that obviously returns SPARQL+JSON / SPARQL+RDF ..? then libraries like
ARC and rdflib, jena etc can be used by the developers; essentially just
a little introduction protocol and offloading all the hard work on to
these fantastic libraries. 


Indeed the API implementations redirect to the SPARQL endpoints and in 
the case of our own Java implementation it does, of course, build on Jena.


However, it is possible for a consumer of the published data to work 
without needing to use any of those toolkits. They can get started with 
just simple web GETs, with easy to use parameters, and can consume the 
data with standard JSON and XML tools. As they get comfortable with the 
data, the generic data model and the specific vocabularies involved in 
the sources, migrating to the full power of SPARQL and RDF APIs 
should be easier. The API exposes you to the structure of the data and 
makes it easy to play around with, giving you a motivation to get to 
grips with the more powerful tools.



Which in turn would also be a further
introduction to Linked Data. Further SPARQL+JSON is really easy to
decode and use 


Sure, if you already understand the modelling behind it. However, to 
someone who doesn't (and isn't yet motivated to do so) it can appear a 
little arcane.


We were also very keen to ensure the essence of RDF is preserved through 
the API, in particular the resource/thing centric and schemaless nature 
of it. A danger of saying "just use SPARQL+JSON" is not just the 
learning curve but the rigidity of it. SPARQL Describe and LCBD are 
wonderful things and the API makes it easy to get descriptions and 
discover what is possible when your data linking isn't limited by a 
rigid schema.


Dave




Re: Creating JSON from RDF

2009-12-17 Thread Dave Reynolds

Dave Reynolds wrote:


Jeni Tennison wrote:


I don't know where the best place is to work on this: I guess at some 
point it would be good to set up a Wiki page or something that we 
could use as a hub for discussion?


I'd suggest setting up a Google Code area and making anyone who is 
interested a committer. That gives us a Wiki but also hosting for 
associated code for generating/navigating the format. I'd be happy to 
set one up.


Now done. We've opened a Google Code area "linked-data-api" and 
summarized both the aims/assumptions [1] and the various design issues 
and suggestions that have come out of the list discussions so far [2].


The wiki is public access for reading and commenting.
Anyone who is interested in joining in with working on this please ping 
either Jeni or myself and we'll add you as a committer.


Cheers,
Dave

[1] http://code.google.com/p/linked-data-api/wiki/AssumptionsAndGoals
[2] http://code.google.com/p/linked-data-api/wiki/DesignQuestions



Re: APIs and Lists

2009-12-15 Thread Dave Reynolds

Hi Jeni,

Sorry to be slow to reply to this one.

Jeni Tennison wrote:

Dave (Reynolds) raised the point that lists are an integral part of most 
APIs. This is another thing that we know we need to address in the UK 
linked government data project, but are unsure as yet how best to do so.



This is a bit of a brain dump of my current thinking, which is mostly 
packed with uncertainty! I'd be very grateful for any thoughts, 
guidance, pointers that you have.


I was getting bogged down trying to do my own dump inline. So instead I've 
put up a blog post [1] about a pattern we ended up using for this. This 
is in the spirit of sharing experiences rather than claiming it as a one 
true pattern.



# Defining List Membership #


As noted in [1] we decided to use SPARQL SELECT to define the list 
membership.


This enables you to order the lists, which we often found important (the 
rdf:Bag/rdfs:member pattern is of course unordered).


It also subsumes the other choices. If you want to define membership 
explicitly in the data store then the SELECT may be trivial:


   SELECT ?resource WHERE {
      <http://education.data.gov.uk/id/school/phase/nursery>
      rdfs:member ?resource . }


You can also define membership in terms of an OWL class description, in 
which case the final access becomes:


   SELECT ?resource WHERE { ?resource a <http://myowlclass> . }

So you are able to do the meat of the definition inline in the store or 
in some inference layer but are not *required* to do this.
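
A sketch of such a class description, with hypothetical eg: names:

  eg:NurserySchool a owl:Class ;
      owl:equivalentClass [
          a owl:Restriction ;
          owl:onProperty eg:phaseOfEducation ;
          owl:hasValue eg:nursery
      ] .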



  <http://education.data.gov.uk/id/school>
    a api:List ;
    rdfs:label "Schools"@en ;
    api:itemType <http://education.data.gov.uk/def/school/School> .

or:

  <http://education.data.gov.uk/id/school/administrativeDistrict/47UE>
    a api:List ;
    rdfs:label "Schools in Worcester"@en ;
    api:where "?item <http://education.data.gov.uk/def/school/districtAdministrative> <http://statistics.data.gov.uk/id/local-authority-district/47UE>" .


or use something like SPIN [1] to express the query as RDF.


In our case we defined the queries via an API but could also store them 
in an RDF configuration file as strings in SPARQL syntax. Exploding the 
query into RDF structures wasn't needed but YMMV.



Or we could go one level higher and do something like:

  <http://education.data.gov.uk/id/school/phase/*>
    a api:ListSet ;
    rdfs:label "Schools By Phase of Education"@en ;
    api:pattern "http://education.data.gov.uk/id/school/phase/(nursery|primary|secondary)"^^xsd:string ;
    api:map [
      api:regexGroup 1 ;
      api:property rdf:type ;
      api:enumeration [
        api:token "nursery" ;
        api:resource <http://education.data.gov.uk/def/school/TypeOfEstablishment_EY_Setting> ;
      ],
      ...
    ] .

It's not at all clear to me what the best approach is here. I tend to 
think that although a higher-level language might make things simpler in 
some ways, providing SPARQL queries gives the most flexibility. 


Agreed, that's the way we went and it seemed OK as far as we pushed it.


# Pagination and Sorting #

Lists are often going to be very long, so we'll need some way to support 
paging through the results that come back. It might also be useful to 
provide different sort orders. For example:


  
http://education.data.gov.uk/doc/school?sort=label&startIndex=21&itemsPerPage=20 



should give the second page of (20) results, in label order.

What I thought here is that we should assign *collections* URIs like:

  http://education.data.gov.uk/id/school

These are unordered and unpaginated. A request would result in a 303 
redirect to the document:


  http://education.data.gov.uk/doc/school

which is the same as:

  
http://education.data.gov.uk/doc/school?sort=label&startIndex=1&itemsPerPage=20 



(say) and is the first page of the (ordered, paginated) list. The RDF 
graph actually returned would be something like:


  <http://education.data.gov.uk/id/school>
    rdfs:label "Schools"@en ;
    foaf:isPrimaryTopicOf <http://education.data.gov.uk/doc/school> .

  <http://education.data.gov.uk/doc/school>
    rdfs:label "Schools (First 20, Ordered Alphabetically)"@en ;
    foaf:primaryTopic <http://education.data.gov.uk/id/school> ;
    xhv:next <http://education.data.gov.uk/doc/school?startIndex=21> ;
    ... other metadata ...
    api:items (
      <http://education.data.gov.uk/id/school/135160>
      <http://education.data.gov.uk/id/school/135441>
      <http://education.data.gov.uk/id/school/135868>
      ...
    ) .

  <http://education.data.gov.uk/id/school/135160>
    rdfs:label "# New Comm Pri @ Allaway Avenue" ;
    ... other triples ...

  ... statements about the other members of this list ...

Note here that the triples about the collection are curtailed to not 
include all the members of the collection (since to include them would 
kinda defeat the purpose of the pagination). If the collection were 
defined through a mechanism other than a list of members, then including 
that configuration information would be a good thing to do here

Re: Creating JSON from RDF

2009-12-14 Thread Dave Reynolds

Jeni Tennison wrote:

On 12 Dec 2009, at 22:27, Danny Ayers wrote:

I can't offer any real practical suggestions right away (a lot to
digest here!) but one question I think right away may some
significance: you want this to be friendly to normal developers - what
kind of things are they actually used to? Do you have any examples of
existing JSON serializations of relatively complex data structures -
something to give an idea of the target.

While it's a bit apples and oranges, there presumably are plenty of
systems now pushing out JSON rather than the XML they would a few
years ago - is there anything to be learnt from that scenario that can
be exploited for RDF?


Great idea to look at what currently exists. Let's see.

The flickr API [1] is notable in that the JSON is a direct mapping from 
the XML API. From what I can tell they don't even try to use values that 
are anything other than strings, and they have a moderately ugly 
_content property for when elements in the XML API have content 
(though it mostly uses attributes, from what I can tell).


The Twitter API [2] provides JSON and XML (Atom) formats. There are a 
whole bunch of different queries you can do, most of which contain 
nested objects etc. Interestingly, the JSON that you get back from a 
search does contain metadata about the search and a 'results' property 
that holds an array of the results. So that could be a relevant model 
for SPARQL results formats.


CouchDB [3] is a purely JSON API which has to be very generic (since 
CouchDB can be used to contain anything). It uses reserved property 
names like _id (this and _content in the flickr API make me think 
that a leading underscore is the expected way to create reserved 
property names).


Yahoo! [4] have a JSON API that is again based on an XML API with a 
straight-forward mapping from the XML to the JSON.


The Google Data Protocol [5] uses JSON that is generated from the 
equivalent Atom feed. Interestingly, they provide support for namespaces 
by having xmlns properties (with $ used instead of : in any 
qualified names). Unlike the other APIs, they do use namespaces, but 
only a handful. I strongly doubt that any developer using it would 
actually resolve openSearch$startIndex by looking at the 
xmlns$openSearch property.


Is that enough of an idea to be getting on with?


Probably, though one additional particularly relevant one is Freebase. 
That has a somewhat RDF-like abstract data model but data is both 
queried and returned as JSON. A query is a JSON template with nulls and 
blank structures where you want details filled in (plus meta properties 
for operations like sorting). Everything is identified by an id (a 
Freebase topic path, pretty much the trailing part of a URI) and/or a 
guid. Freebase is one reason I suggested id as a relatively familiar 
property label for identifiers.


Cheers,
Dave

[1] http://www.freebase.com/docs/data
[2] http://www.freebase.com/docs/data/first_query



Re: Creating JSON from RDF

2009-12-14 Thread Dave Reynolds

Jeni Tennison wrote:

It's worth noting that most of these APIs support a callback= parameter 
that makes the API return Javascript containing a function call rather 
than simply the JSON itself. I regard this as an unquestionably 
essential part of a JSON API, whether it uses RDF/JSON or RDFj or irJSON 
or whatever, in order to make the JSON usable cross-site.


+1

Dave



Re: Creating JSON from RDF

2009-12-14 Thread Dave Reynolds

Hi Jeni,

Jeni Tennison wrote:


On 13 Dec 2009, at 13:34, Dave Reynolds wrote:


I agree we want both graphs and SPARQL results but I think there is 
 a third case - lists of described objects.


I absolutely agree with you that lists of described objects is an 
essential part of an API. In fact, I was going to (and will!) write a 
separate message about possible approaches for creating such lists.


It seemed to me that lists could be represented with RDF like:

  <http://statistics.data.gov.uk/doc/local-authority?page=1>
    rdfs:label "Local Authorities - Page 1" ;
    xhv:next <http://statistics.data.gov.uk/doc/local-authority?page=2> ;
    ...
    api:contents (
      <http://statistics.data.gov.uk/id/local-authority/00QA>
      <http://statistics.data.gov.uk/id/local-authority/00QB>
      <http://statistics.data.gov.uk/id/local-authority/45UB>
      ...
    )


This is just RDF, and as such any rules that we create about mapping RDF 
graphs to JSON could apply. (I agree that the list page should include 
extra information about the items in the list, but that seems to me to 
be a separable issue.)


Sure, but there are some advantages to treating this ordered list of 
results as an API issue rather than a modelling issue.


I'll respond properly on your other thread.

One thing it makes me think is that perhaps JSON Schema [1] could form 
the basis of the mechanism for expressing any extra stuff that's 
required about the properties.


Interesting thought, I'll need to go learn more about JSON Schema first.

Note that the $ is taken from RDFj. I'm not convinced it's a good 
idea to use this symbol, rather than simply a property called about 
or this -- any opinions?


I'd prefer id (though about is OK), $ is too heavily overused in 
javascript libraries.


I agree. From the brief survey of JSON APIs that I did just now, it 
seems as though prefixing a reserved property name with a '_' is the 
usual thing. I'd suggest '_about' because it's similar to RDFa and 
because '_id', to me at least, implies a local identifier rather than a 
URI.


No objection to _about, as per separate thread it was Freebase 
especially that motivated the suggestion of id.


[On api:mapping usage]
Are you thinking of this as something the publisher provides or the 
API caller provides?


If the former, then OK but as I say I think a zero config set of 
default conventions is OK with the API to allow fine tuning.


I'm thinking of this as something that the publisher of the API creates 
(to describe/define the API). Note, though, that the publisher of the 
API might not be the publisher of the data, and that it could feasibly 
be possible for there to be a service that would allow clients to supply 
a configuration, point at a datastore, and have the API just work.


OK, agreed. My concern is that developers shouldn't have to wade through 
this mapping to understand what they are getting, unless they are 
already RDF heads and care about that aspect.


[On multi-valued properties]

I guess there are two choices if there was no specification:

  1. always give one value for the property; if there are several values 
in the graph, then provide the first
  2. give an array when there are multiple values and a singleton when 
there's only one


I did have another vague notion of providing two properties side by 
side, one singular and one plural, so you would have:


  {
    "nick": "JeniT"
  }

or

  {
    "nicks": ["wilding", "wilda"]
  }

side by side in the same list of objects. But of course that would 
require configuration anyway (to provide pluralised versions of the 
label), so I'm not particularly taken with it.


It does concern me that if there are RDF graphs which contain 
descriptions of several resources of the same type, we might get into a 
situation where there are two resources for which the default behaviour 
would be different; we need to have a way of reconciling this (for 
example, if any of the resources in the graph have multiple values for a 
property, then it always uses an array).


Yes. With zero configuration there will always either be some 
inconsistency or you have to force the more general convention on 
people. I agree with Mark that developers can write code to adapt to the 
list/no-list case and with configuration we have the option to make this 
more consistent in places where this is a problem.


One possibility is a bootstrapping service where you give sample data 
and ontology, if available, and get back a suggested mapping. That can do 
the scanning of data to guess at multi-valuedness once, so you don't pay 
the cost of doing that in the live API.



[snip]
Language codes are effectively open ended. I can't necessarily predict 
what lang codes are going to be in my data and provide a property 
mapping for every single one.


I know they're *potentially* open-ended; I think in practice, for a 
single API, they are probably not. 


Depends on whether this is your own data or you are harvesting/receiving

Re: Creating JSON from RDF

2009-12-14 Thread Dave Reynolds

Mark Birbeck wrote:


On Sat, Dec 12, 2009 at 9:42 PM, Jeni Tennison j...@jenitennison.com wrote:



One thing that we want to do is provide JSON representations of both RDF
graphs and SPARQL results. I wanted to run some ideas past this group as to
how we might do that.


Great again. :)

In the work I've been doing, I've concluded that in JSON-world, an RDF
graph should be a JSON object (as explained in RDFj [1], and as you
seem to concur), but also that SPARQL queries should return RDFj
objects too.

In other words, after a lot of playing around I concluded that there
was nothing to be gained from differentiating between representations
of graphs, and the results of queries.


OK but the relationship between the JSON objects you get back from 
queries and the source RDF graph may be indirect. In a SPARQL query you 
sometimes pull pieces out of various depths of connected objects, but 
as a JSON consumer you probably want to think of the results as an 
array of simple structures.



Note that the $ is taken from RDFj. I'm not convinced it's a good idea to
use this symbol, rather than simply a property called about or this --
any opinions?


I agree, and in my RDFj description I do say that since '$' is used in
a lot of Ajax libraries, I should find something else.

However, in my view, the 'something else' shouldn't look like a
predicate, so I don't think 'about' or 'this' (or 'id' as someone
suggests later in the thread), should be used. (Note also that 'id' is
used in a related but slightly different way by Dojo.)

Also, the underscore is generally related to bnodes, so it might be
confusing on quick reads through. (We have a JSON audience and an RDF
audience, and need to make design decisions with both in mind.)

I've often thought about the empty string, '@' and other
possibilities, but haven't had a chance to try them out. E.g., the
empty string would simply look like this:

  {
    "": "http://www.w3.org/TR/rdf-syntax-grammar",
    "title": "RDF/XML Syntax Specification (Revised)",
    "editor": {
      "name": "Dave Beckett",
      "homepage": "http://purl.org/net/dajobe/"
    }
  }

Since I always tend to indent the predicates in RDFj anyway, just to
draw attention to them, then the empty string is reasonably visible.
However, @ would be even more obvious:

  {
    "@": "http://www.w3.org/TR/rdf-syntax-grammar",
    "title": "RDF/XML Syntax Specification (Revised)",
    "editor": {
      "name": "Dave Beckett",
      "homepage": "http://purl.org/net/dajobe/"
    }
  }

Anyway, it shouldn't be that difficult to come up with something.


Naming discussions are often the hardest and can be non-terminating :) 
There is never only one acceptable answer, rarely a really good answer, 
and everyone has opinions. Syntax and semantics are much easier than names.


As I've said, I like "id" as in Freebase, I'd go along with "_about" 
(though I take your point about bNode confusion) and I'd go along with 
"@" (but it makes me think of pointers rather than ids).



So, the first piece of configuration that I think we need here is to map
properties on to short names...


That's what 'tokens' in the 'context' object do, in RDFj.


I think there are two separate things here: how the producer of the JSON 
maps their RDF to JSON names, and then whether the consumer is able to 
inspect that mapping.


For the first of those, I think the proposal is that there be a set of 
default conventions (e.g. as in Exhibit JSON) plus an optional mapping 
spec which enables the names to be improved.


For the second part, I agree with you that a context object in the 
delivered JSON would be useful so that those consumers who care can see 
the mapping that was applied and even invert it.



However, in any particular graph, there may be properties that have been
given the same JSON name (or, even more probably, local name). We could
provide multiple alternative names that could be chosen between, but any
mapping to JSON is going to need to give consistent results across a given
dataset for people to rely on it as an API, and that means the mapping can't
be based on what's present in the data. We could do something with prefixes,
but I have a strong aversion to assuming global prefixes.


I'm not sure here whether the goal is to map /any/ API to RDF, but if
it is I think that's a separate problem to the 'JSON as RDF' question.


I think the problem is getting dev-friendly 'JSON out of RDF' rather 
than 'JSON as RDF'; being able to invert that to get back the RDF would 
be an added bonus rather than a design requirement.



## Multi-valued Properties ##

First one first. It seems obvious that if you have a property with multiple
values, it should turn into a JSON array structure. For example:

 [] foaf:name "Anna Wilder" ;
    foaf:nick "wilding", "wilda" ;
    foaf:homepage <http://example.org/about> .

should become something like:

 {
   "name": "Anna Wilder",
   "nick": [ "wilding", "wilda" ],
   "homepage": "http://example.org/about"
 }



Right. For those who haven't 

Re: Creating JSON from RDF

2009-12-13 Thread Dave Reynolds

Hi Jeni,

Jeni Tennison wrote:

As part of the linked data work the UK government is doing, we're 
looking at how to use the linked data that we have as the basis of APIs 
that are readily usable by developers who really don't want to learn 
about RDF or SPARQL.


Wow! Talk about timing. We are looking at exactly the same issue as part 
of the TSB work and were starting to look at JSON formats just this last 
couple of days. We should combine forces.


One thing that we want to do is provide JSON representations of both RDF 
graphs and SPARQL results. I wanted to run some ideas past this group as 
to how we might do that.


I agree we want both graphs and SPARQL results but I think there is 
a third case - lists of described objects.


This seems to have been a common pattern in the apps that I've worked 
on. You want to find all objects (resources in RDF speak) that match 
some criteria, with some ordering, and get back a list of them and their 
associated properties. This is like a SPARQL DESCRIBE operating on each 
of an ordered list of resources found by a SPARQL SELECT.


The point is that this is not a graph because the top level list needs 
to be ordered. It is not a SPARQL result set because you want the 
descriptions to include any of the properties that are present in the 
data (potentially including bNode closure) without having to know all 
those and spell them out in the query. But it is a natural thing to want 
to return from a REST API.


To put this in context, what I think we should aim for is a pure 
publishing format that is optimised for approachability for normal 
developers, *not* an interchange format. RDF/JSON [1] and the SPARQL 
results JSON format [2] aren't entirely satisfactory as far as I'm 
concerned because of the way the objects of statements are represented 
as JSON objects rather than as simple values. I still think we should 
produce them (to wean people on to, and for those using more generic 
tools), but I'd like to think about producing something that is a bit 
more immediately approachable too.


RDFj [3] is closer to what I think is needed here. However, I don't 
think there's a need for setting 'context' given I'm not aiming for an 
interchange format, there are no clear rules about how to generate it 
from an arbitrary graph (basically there can't be without some 
additional configuration) and it's not clear how to deal with datatypes 
or languages.


WRT 'context' you might not need it but I don't think it is harmful. 
 I think if we said to developers that there is some outer wrapper like:


{
   "format": "RDF-JSON",
   "version": "0.1",
   "mapping": ... magic stuff ...
   "data": ... the bit you care about ...
}

The developers would be quite happy doing that one dereference and 
ignoring the mapping stuff, but it might allow inversion back to RDF for 
those few who do care, or come to care.


I suppose my first question is whether there are any other JSON-based 
formats that we should be aware of, that we could use or borrow ideas from?


The one that most intrigued me as a possible starting point was the 
Simile Exhibit JSON format [1]. It is developer friendly in much the way 
that you talk about but it has the advantage of zero configuration, some 
measure of invertibility, has an online translator [2] and is supported 
by the RPI Sparql proxy [3].


I've some reservations about standardizing on it as is:
 - lack of documentation of the mapping
 - some inconsistencies in how references between resources are encoded 
(at least judging by the output of Babel[2] on test cases)
 - handling of bNodes - I'd rather single referenced bNodes were 
serialized as nested structures


[There was another format we used in a project in my previous existence 
but I'm not sure if that was made public anywhere, will check.]


Assuming there aren't, I wanted to discuss what generic rules we might 
use, where configuration is necessary and how the configuration might be 
done.


One starting assumption to call out: I'd like to aim for a zero 
configuration option where explicit configuration is only used to 
help tidy things up but isn't required to get started.



# RDF Graphs #

Let's take as an example:

  <http://www.w3.org/TR/rdf-syntax-grammar>
    dc:title "RDF/XML Syntax Specification (Revised)" ;
    ex:editor [
      ex:fullName "Dave Beckett" ;
      ex:homePage <http://purl.org/net/dajobe/> ;
    ] .

In JSON, I think we'd like to create something like:

  {
    "$": "http://www.w3.org/TR/rdf-syntax-grammar",
    "title": "RDF/XML Syntax Specification (Revised)",
    "editor": {
      "name": "Dave Beckett",
      "homepage": "http://purl.org/net/dajobe/"
    }
  }


+1 on style

In terms of details I was thinking of following the Simile convention on 
short-form naming which, in the absence of clashes, uses the rdfs:label, 
falling back to the localname, as the basis for the shortened property 
names. So knowing nothing else the bNode would be:


  ...
editor: {
   fullName: 

Re: Dons flame resistant (3 hours) interface about Linked Data URIs

2009-07-13 Thread Dave Reynolds

Steve Harris wrote:

On 10 Jul 2009, at 11:00, Toby Inkster wrote:


On Fri, 2009-07-10 at 10:40 +0100, Steve Harris wrote:

Personally I think that RDF/XML doesn't help, it's too hard to write
by hand.


MicroTurtle, the sloppy RDF format:

http://buzzword.org.uk/2009/microturtle/spec


That's very interesting. I like it, but I'm not sure that it's 
necessarily what I would ideally like if I were coming to RDF afresh. It 
looks like the Perl of RDF syntaxes :) Which is good for some people, 
but not others.


Something like NTriples + UTF-8 + @prefix could be an answer for people 
new to RDF. One of the problems is the various triple shortcut syntaxes 
we use. Either the stacked syntax of RDF/XML, or the punctuation of Turtle.


For anyone who's about to say that Turtle = ntriples + UTF-8 + @prefix - 
it doesn't help. The vast majority of examples you see online use at 
least ; and probably [] and , too, which makes it very hard to follow. 
At least in my experience of introducing developers to RDF.


FWIW my experience with technically savvy but non-semweb people is that 
Turtle is not only a low barrier, it is a selling feature in a way that 
abbreviated N-Triples isn't.  I've met people who are using RDF solely 
because they find Turtle a more convenient, compact notation for writing 
down their data than any of the (mostly XML based) alternatives they've 
tried.
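
A small illustration of the difference (hypothetical eg: names):

  # One full triple per line, N-Triples style:
  eg:book dc:title "RDF Primer" .
  eg:book dc:creator eg:frank .
  eg:book dc:creator eg:eric .

  # The same data using Turtle's ';' and ',' shortcuts:
  eg:book dc:title "RDF Primer" ;
      dc:creator eg:frank , eg:eric .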


The fact that you can use a similar notation in queries has helped too.

Dave
--
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England