Re: Advocacy URL for publishing data with an explicit license
Talis has written about this issue, or encouraged others to write about it, for quite a while. Here are a few links; there are probably others I have forgotten about.

http://blogs.talis.com/nodalities/2010/02/sharing-data-on-the-web.php
http://blogs.talis.com/nodalities/2009/07/linked-data-public-domain.php
http://blogs.talis.com/nodalities/2007/07/open_data_licensing_an_unnatur.php

HTH Ian

On Monday, October 24, 2011, Richard Cyganiak rich...@cyganiak.de wrote: Dear list, We all know that data publishers *should* publish their data along with an explicit license that explains what kind of re-use is allowed. Can anyone suggest a good reference/link/URL that makes this case? A blog post or advocacy site or similar? Bonus points if it has specific recommendations for RDF. My preferred candidate so far is this – but it's not particularly strong on the "why": http://www.w3.org/TR/void/#license Thanks, Richard
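For reference, the VoID approach Richard mentions amounts to attaching a dcterms:license triple to the dataset description. A minimal sketch in Turtle; the dataset URI and the choice of CC0 here are illustrative only, not from the thread:

```turtle
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# An example dataset description carrying an explicit license
<http://example.org/dataset/population>
    a void:Dataset ;
    dcterms:license <http://creativecommons.org/publicdomain/zero/1.0/> .
```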
Re: Linked Data visualisation of Indices of Multiple Deprivation
Very nice! On 8 Sep 2011 17:58, Bill Roberts b...@swirrl.com wrote: The list might be interested in a prototype application built by Steve Peters that visualises and links up data on: - the English Indices of Multiple Deprivation from http://opendatacommunities.org (which I was involved with) - schools (from the data.gov.uk Edubase dataset) - councillors (from OpenlyLocal) - MPs (from theyworkforyou) helped also by other APIs including Mapit, Geonames, Google maps... It's one of the nicest examples I've seen of this kind of mashup and shows off what it means to be truly linked. It uses capital letters Linked Data where appropriate and available, but also pulls in stuff from all kinds of other APIs too. Blog post here: http://openviz.wordpress.com/2011/09/08/indices-of-deprivation-linked-data-prototype/ App itself here: http://dclgexamples.mywebcommunity.org/imd_demo_v7.htm Bill Roberts
Re: Squaring the HTTP-range-14 circle
On Fri, Jun 17, 2011 at 12:35 PM, Dave Reynolds dave.e.reyno...@gmail.com wrote:

On Thu, 2011-06-16 at 21:22 -0400, Tim Berners-Lee wrote: On 2011-06-16, at 16:41, Ian Davis wrote: The problem here is that there are so few things that people want to say about web pages compared with the multitude of things they want to say about every other type of thing in existence. Well, that is a wonderful new thing. For a long while it was difficult to put data on the web, while there is quite a lot of metadata. It is a wonderful idea that the semantic web may be beating the document web hands down, but it's not at all clear that we should trash the use of URIs to refer to documents as we do in the document web.

I'm sure Ian wasn't claiming the data web is beating the document web, and equally sure that you don't really think he was :)

Yes, absolutely.

FWIW my experience is also that most of the data that people want to publish *in RDF* is about things rather than web pages. Clearly there *are* good use cases for capturing web page metadata in RDF, but I've not seen that many in-the-wild cases where people wanted to publish data about *both* the web page and the thing. That's why Ian's Back to Basics suggestion works for me [as a fall back from just use #]. My interpretation is that, unlike most of this thread, it wasn't saying use URIs ambiguously but saying the interpretation of the URI is up to the publisher and is discovered from the data, not from the protocol response: it is legitimate to use an http-no-# URI to denote a thing if that is what you really want to do.

Yes, that's exactly what I am saying.

Thus if I want to publish a table of e.g. population statistics at http://foobar.gov.uk/datasets/population then I can do so and use that URI within the RDF data as denoting the data set.
As publisher I'm saying this is a qb:DataSet not a web page; anything that looks like a web page when you point a browser at it is just a rendering related to that data, and that rendering isn't being given a separate URI so you can talk about it, sorry about that.

If you use HTTP 200 for something different, then you break my ability to look at a page, review it, and then express my review in RDF, using the page's URI as the identifier.

Not quite. It is saying that you can't give a review for my http://foobar.gov.uk/datasets/population web page because the RDF returned by the URI says it denotes a dataset, not the web page. You can still review the dataset itself. You can review other web pages which don't return RDF data saying they are something other than a web page. [As an aside, I would claim that most reviews are in fact about things - restaurants, books, music - not about the web pages.]

Quite. When a facebook user clicks the Like button on an IMDB page they are expressing an opinion about the movie, not the page.

Ian
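To make the example concrete, here is roughly the kind of data the http://foobar.gov.uk/datasets/population URI might return with a 200 response; the Turtle is a sketch and the title is invented for illustration:

```turtle
@prefix qb:      <http://purl.org/linked-data/cube#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# The request URI itself is typed as a dataset, not as a document
<http://foobar.gov.uk/datasets/population>
    a qb:DataSet ;
    dcterms:title "Population statistics" .
```

A consumer that trusts the data over the protocol response then has no reason to treat the URI as naming a web page.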
Re: Squaring the HTTP-range-14 circle
On Fri, Jun 17, 2011 at 2:04 PM, Tim Berners-Lee ti...@w3.org wrote: On 2011-06-17, at 08:51, Ian Davis wrote: If you use HTTP 200 for something different, then you break my ability to look at a page, review it, and then express my review in RDF, using the page's URI as the identifier. Not quite. It is saying that you can't give a review for my http://foobar.gov.uk/datasets/population web page because the RDF returned by the URI says it denotes a dataset not the web page. You can still review the dataset itself. You can review other web pages which don't return RDF data saying they are something other than a web page. [As an aside, I would claim that most reviews are in fact about things - restaurants, books, music - not about the web pages.] Quite. When a facebook user clicks the Like button on an IMDB page they are expressing an opinion about the movie, not the page.

BUT when they click a Like button on a blog they are expressing that they like the blog, not the movie it is about. AND when they click like on a facebook comment they are saying they like the comment, not the thing it is commenting on. And on Amazon people say I found this review useful to like the review of the product being reviewed, separately from rating the product. So there is a lot of use out there which involves people expressing stuff in general about the message, not its subject.

Sure. All these use cases stand and can co-exist. I can look at the data in any of those responses, or data I glean from elsewhere, to figure out if the URI I'm accessing refers to the content I received or the subject of that content. That model works for any protocol BTW.

I am really not sure that I want to give up the ability in my browser to bookmark a page about something -- the IMDB page about a movie, rather than the movie itself.

OK, we differ here then. I would prefer to bookmark the movie because that's what I'm interested in. The page will change over the years but the movie will still persist.
Today you have no choice because your conceptual model does not give a URI to the movie and doesn't see the need to generate 2 URIs. When the cost is just fixing Microdata syntax to make it easy to say things about the subject of a page.

I don't think this has anything to do with microdata.

Ian
Re: Squaring the HTTP-range-14 circle
On Fri, Jun 17, 2011 at 2:04 PM, Tim Berners-Lee ti...@w3.org wrote: Not quite. It is saying that you can't give a review for my http://foobar.gov.uk/datasets/population web page because the RDF returned by the URI says it denotes a dataset not the web page. You can still review the dataset itself. You can review other web pages which don't return RDF data saying they are something other than a web page. [As an aside, I would claim that most reviews are in fact about things - restaurants, books, music - not about the web pages.] Quite. When a facebook user clicks the Like button on an IMDB page they are expressing an opinion about the movie, not the page. BUT when they click a Like button on a blog they are expressing they like the blog, not the movie it is about. AND when they click like on a facebook comment they are saying they like the comment not the thing it is commenting on. And on Amazon people say I found this review useful to like the review of the product being reviewed, separately from rating the product. So there is a lot of use out there which involves people expressing stuff in general about the message not its subject.

As an additional point, a review _is_ a separate thing; it's not a web page. It is often contained within a web page. It seems you are conflating the two here. Reviews and comments can be and often are syndicated across multiple sites, so clearly any liking of the review needs to flow with it.

Ian
Re: Squaring the HTTP-range-14 circle
Small typo changed the meaning of what I was saying: On Fri, Jun 17, 2011 at 2:18 PM, Ian Davis li...@iandavis.com wrote: OK, we differ here then. I would prefer to bookmark the movie because that's what I'm interested in. The page will change over the years but the movie will still persist. Today you have no choice because your conceptual model does not give a URI to the movie and doesn't see the need to generate 2 URIs. But I meant to write: Today you have no choice because your conceptual model does not give a URI to the movie and [the publisher] doesn't see the need to generate 2 URIs. Of course I recognise your conceptual model sees the need for multiple URIs... :) Ian
Re: Squaring the HTTP-range-14 circle
Tim, On Thu, Jun 16, 2011 at 6:04 PM, Tim Berners-Lee ti...@w3.org wrote: I don't think 303 is a quick and dirty hack. It does mean a large extension of HTTP to be used with non-documents. It does have efficiency problems. It is an architectural extension to the web architecture.

We have had many years for this architectural extension to be adopted, and many of us producing linked data have been diligent in supporting, promoting and educating people about it. Even I, with my many many attempts to get this decision reconsidered, have promoted the W3C consensus. Conversely, many more people have studied this extension and rejected it. Companies such as Google, Facebook, Microsoft and Yahoo, who are all W3C members and can influence these decisions through formal channels if they wish, have looked at the httpRange decision and decided it doesn't work for them. Instead they have chosen different approaches that require more effort to consume but lower the conceptual barrier for publishers. However, they are convinced of the need for URIs to identify things that are not just web pages, which is a huge positive. These companies collectively account for a very large proportion of web traffic and activity. I think just saying that they're wrong and should change their approach is simply being dogmatic. They are telling us that we are wrong. We should listen to them.

If you want to give yourself the luxury of being able to refer to the subject of a webpage, without having to add anything to disambiguate it from the web page, then for the sake of your system, so you can use the billion web pages for your purposes, you now stop others like me from using semantic web systems to refer to those web pages, or in fact to the other hundred million web pages either.

The problem here is that there are so few things that people want to say about web pages compared with the multitude of things they want to say about every other type of thing in existence.
Yet the httpRange decision makes the web page a privileged component. I understand why that might have seemed a useful decision, after all this is the web we are talking about, but it has turned out not to be. The web page is only the medium for conveying information about the things we are really interested in. The analogy is metadata about a book. Very little of it is about the physical book, i.e. the medium. Perhaps you would want to record its dimensions, mass, colour, binding or construction. There are many many more things you would want to record about the book's content, themes, people and places mentioned, author etc.

Maybe you should find an efficient way of doing what you want without destroying the system (which you as well have done so much to build)

I think this is unreasonably strong. Nothing is being destroyed. Nothing has broken. A few days after I wrote this post (http://blog.iandavis.com/2010/12/06/back-to-basics/) I changed one of the many linked datasets I maintain to stop using 303 redirects over a few million resources. No-one has noticed yet. Nothing has broken.

Ian
Re: Quick reality check please
On Fri, Apr 8, 2011 at 1:21 PM, Richard Cyganiak rich...@cyganiak.de wrote: With the second issue it's not so clear what to do about it. It's a question of good practice, and I'm not aware of any document where that recommendation could be easily added. Maybe it could be written up as a pattern for the Linked Data Patterns book? http://patterns.dataincubator.org/ Ian
Re: Exciting changes at Data.Southampton.ac.uk!
Christopher, I really don't see why I should have to reengineer my entire toolchain simply to consume your proprietary format. It is well known that the standard for information interchange is the Microsoft Word 97 document format, which is easily read by every popular computing package. I for one will not be submitting to the oppression of PDF.

Ian

On 1 Apr 2011 08:29, Christopher Gutteridge c...@ecs.soton.ac.uk wrote: After some heated debate following the backlash against me for my recent comments about PDF, I've been forced to shift to recommending PDF as the preferred format for the data.southampton.ac.uk site, both for publishing and importing data. There are some issues with this and I know not everyone will be happy with the decision; it wasn't easy to make... but on reflection it's the right one. It is much easier for non-programmers (the majority of people) to work with PDF documents, and they are supported by pretty much every platform you can think of, with a choice of tools and the benefit of familiarity.
We've provided a wrapper around 4store to make PDF the default output mode: http://sparql.data.southampton.ac.uk/?query=PREFIX+soton%3A+%3Chttp%3A%2F%2Fid.southampton.ac.uk%2Fns%2F%3E%0D%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0APREFIX+geo%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2003%2F01%2Fgeo%2Fwgs84_pos%23%3E%0D%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0D%0APREFIX+org%3A+%3Chttp%3A%2F%2Fwww.w3.org%2Fns%2Forg%23%3E%0D%0APREFIX+spacerel%3A+%3Chttp%3A%2F%2Fdata.ordnancesurvey.co.uk%2Fontology%2Fspatialrelations%2F%3E%0D%0APREFIX+ep%3A+%3Chttp%3A%2F%2Feprints.org%2Fontology%2F%3E%0D%0APREFIX+dct%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0D%0APREFIX+bibo%3A+%3Chttp%3A%2F%2Fpurl.org%2Fontology%2Fbibo%2F%3E%0D%0APREFIX+owl%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0D%0A%0D%0ASELECT+%3Fs+WHERE+{%0D%0A%3Fs+%3Fp+%3Fo+.%0D%0A}+LIMIT+10output=pdfjsonp=#results_table And most information URIs can now be resolved to PDF, but we are sticking to HTML as the default (for now) http://data.southampton.ac.uk/products-and-services/FreshFruit.pdf The full details and rationale are on our data blog http://blogs.ecs.soton.ac.uk/data/2011/04/01/pdf-selected-as-interchange-format/E -- Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248 You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/
Re: Exciting changes at Data.Southampton.ac.uk!
On Fri, Apr 1, 2011 at 10:14 AM, Richard Cyganiak rich...@cyganiak.de wrote: Taking advantage of one of PDF's many advantages, we plan to present the first printed and bound version of the complete Linked PDF Data at next year's LDOW workshop. Order your copy now! Shipping fees not included.

See, this is where we differ. Your radical ideas about enabling print in the PDFs just won't get traction in real businesses.

Ian
Re: data schema / vocabulary / ontology / repositories
Please try http://schemapedia.com/ and see if it meets your needs.

On 13 Mar 2011 16:23, Dieter Fensel dieter.fen...@sti2.at wrote: Dear all, for a number of projects I was searching for vocabularies/ontologies to describe linked data. Could you please recommend me places to look for them? I failed to find a convenient entry point for this kind of information; I only found some scattered information here and there. Thanks, Dieter -- Dieter Fensel Director STI Innsbruck, University of Innsbruck, Austria http://www.sti-innsbruck.at/ phone: +43-512-507-6488/5, fax: +43-512-507-9872
Re: The truth about SPARQL Endpoint availability
Is the number of triples that important? With all respect to the people on this list, I think there's a tendency to obsess over triple counts. Aren't we past that bootstrap phase of being awed when we see millions of triples being produced? I thought we'd moved towards being more focussed on the quality and utility of data than sheer numbers? Besides, for me the most interesting datasets are those that are continually changing as they reflect the real world, and I'd like to see us work towards metrics for freshness and coverage.

On Sun, Mar 6, 2011 at 11:20 AM, Tim Berners-Lee ti...@w3.org wrote: Maybe the count of triples should be special-cased in the sparql server code, spotted on input and the store size returned, if it is reasonable for the endpoint to keep track of the size of its store. (Do they anyway?) Tim

On 2011-03-05, at 11:58, Bill Roberts wrote: Thanks Hugh - as someone running a couple of SPARQL endpoints, I'd certainly prefer if people don't run a global count too often (or at all). It is indeed something that makes typical SPARQL implementations work very hard. But it's a good reminder that we should provide an alternative, and I'll look into providing triple counts in voiD. Bill

On 5 Mar 2011, at 15:14, Hugh Glaser wrote: Hi, On 5 Mar 2011, at 14:22, Andrea Splendiani wrote: Hi, I think it depends on the store. I've tried some (from the endpoint list) and some return an answer pretty quickly; some don't, and some don't support count. However, one could have this information only for the stores that answer the count query; no need to try every time.

I am happy for a store implementor or owner to disagree, but I find it very unlikely that the owner of a store with a decent chunk of data (>1M triples, say) would be happy for someone to keep issuing such a query, even if they did decide to give enough resources to execute it. I would quickly blacklist such a site.
VoID: is this a good query: select * where { ?s <http://rdfs.org/ns/void#numberOfTriples> ?o }

I'm no SPARQL or voiD guru, but I think you need a bit more wrapping in the scovo stuff, so more like:

SELECT DISTINCT ?endpoint ?uri ?triples ?uris WHERE {
  ?ds a void:Dataset .
  ?ds void:sparqlEndpoint ?uri .
  ?ds rdfs:label ?endpoint .
  ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
}

Try it at http://kwijibo.talis.com/voiD/ or http://void.rkbexplorer.com/ I guess Pierre-Yves might like to enhance his page by querying a voiD store to also give basic stats. Or someone might like to do a store reporter that uses (a) voiD endpoint(s) plus Pierre-Yves's data (he has a SPARQL endpoint) to do so. And maybe the CKAN endpoint would have extra useful data as well. A real Semantic Web application that queried more than one SPARQL endpoint - now that would be a novelty! Fancy the challenge, it is the weekend?! :-) ciao Hugh

it doesn't seem viable if so. ciao, Andrea

On 5 Mar 2011, at 13:49, Hugh Glaser wrote: Nice idea, but,... :-) SELECT (count(*) as ?c) WHERE { ?s ?p ?o } is a pretty anti-social thing to do to a store. At best, a store of any size will spend a while thinking, and then quite rightly decide it has burnt enough resources and return some sort of error. For a properly maintained site, of course, the VoiD description will give lots of similar information. Best Hugh

On 5 Mar 2011, at 13:06, Andrea Splendiani wrote: Hi, very nice! I have a small suggestion: why don't you ask count(*) where { ?s ?p ?o } of the endpoint? Or ask for the number of graphs? Both pieces of information, the number of triples and the number of graphs, if logged and compared over time, can give a practical view of the liveliness of the content of the endpoint. best, Andrea Splendiani

On 28 Feb 2011, at 18:55, Pierre-Yves Vandenbussche wrote: Hello all, have you already encountered problems with SPARQL endpoint accessibility?
You feel frustrated they are never available when you need them? You develop an application using these services but wonder if they are reliable? Here is a tool [1] that lets you check the availability of public SPARQL endpoints and monitor them over the last hours/days. Stay informed of a particular (or all) endpoint's status changes through RSS feeds. All availability information generated by this tool is accessible through a SPARQL endpoint. This tool fetches public SPARQL endpoints from CKAN open data [2]. From this list, it runs availability tests every hour. [1] http://labs.mondeca.com/sparqlEndpointsStatus/index.html [2] http://ckan.net/ Pierre-Yves Vandenbussche.

Andrea Splendiani Senior Bioinformatics Scientist Centre for Mathematical and Computational Biology +44(0)1582 763133 ext 2004 andrea.splendi...@bbsrc.ac.uk -- Hugh Glaser, Intelligence, Agents, Multimedia
Re: Google's structured seach talk / Google squared UI
Hi, I did something very similar to Google Squared in a small PHP script a couple of years ago: http://iandavis.com/2009/lodgrid/?store=spacequery=jupitercolumns=6 It uses linked data held in the Talis Platform and the platform's full-text search service. More examples are linked from the main page: http://iandavis.com/2009/lodgrid/ Ian

On Fri, Feb 11, 2011 at 10:23 AM, Daniel O'Connor daniel.ocon...@gmail.com wrote: Hi all, This talk might have been seen by some of you, but it was certainly new to me: http://www.youtube.com/watch?v=5lCSDOuqv1Afeature=autoshare Much of this is an exploration of how Google is making use of Freebase's underlying linked data to better understand what they are crawling - deriving what something is by examining its attributes, and automatically creating something like linked data from it. Additionally, it talks about Google Squared - this tool appears to be heavily powered by Freebase data, as well as derived data from the web. I was fairly impressed by the mix of understanding a user query and rendering results as actual entities (one of the few non-facet-based UIs I have seen). For instance: territorial authorities in new zealand http://www.google.com/squared/search?q=territorial+authorities+in+new+zealand Whilst this is not using the typical linked data technology stack of RDF, SPARQL, open-licenced data, etc., it certainly shows you what can be done with data in a graph structure, plus a UI which is a cross between a spreadsheet and a search result.
Re: Google's structured seach talk / Google squared UI
Give me a break, it was only an hour or so's work! :) Seriously, what you suggest is possible with a bit more effort.

On Friday, February 11, 2011, Juan Sequeda juanfeder...@gmail.com wrote: Nice! But unfortunately I have to choose a platform store. Shouldn't I be able to search for jupiter and return results from nasa and dbpedia? Juan Sequeda +1-575-SEQ-UEDA www.juansequeda.com On Fri, Feb 11, 2011 at 5:03 AM, Ian Davis li...@iandavis.com wrote: Hi, I did something very similar to Google Squared in a small PHP script a couple of years ago: http://iandavis.com/2009/lodgrid/?store=spacequery=jupitercolumns=6 It uses linked data held in the Talis Platform and the platform's full-text search service. More examples are linked from the main page: http://iandavis.com/2009/lodgrid/ Ian On Fri, Feb 11, 2011 at 10:23 AM, Daniel O'Connor daniel.ocon...@gmail.com wrote: Hi all, This talk might have been seen by some of you, but it was certainly new to me: The Structured Search Engine http://www.youtube.com/watch?v=5lCSDOuqv1Afeature=autoshare Much of this is an exploration of how Google is making use of Freebase's underlying linked data to better understand what they are crawling - deriving what something is by examining its attributes, and automatically creating something like linked data from it. Additionally, it talks about Google Squared - this tool appears to be heavily powered by Freebase data, as well as derived data from the web. I was fairly impressed by the mix of understanding a user query and rendering results as actual entities (one of the few non-facet-based UIs I have seen). For instance: territorial authorities in new zealand http://www.google.com/squared/search?q=territorial+authorities+in+new+zealand Whilst this is not using the typical linked data technology stack of RDF, SPARQL, open-licenced data, etc., it certainly shows you what can be done with data in a graph structure, plus a UI which is a cross between a spreadsheet and a search result.
Re: What would break, a question for implementors? (was Re: Is 303 really necessary?)
On Tue, Nov 9, 2010 at 11:23 AM, Nathan nat...@webr3.org wrote: Pete Johnston wrote: This document mentions the following class

It's all very simple really, when you remove all the conflated terms.

I am not conflating terms and nor is my example, but I think you are (see below)

What is this:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:wdrs="http://www.w3.org/2007/05/powder-s#"
         xmlns:dbp="http://dbpedia.org/resource/">
  <dbp:Toucan rdf:about="http://iandavis.com/2010/303/toucan">
    <rdfs:label>A Toucan</rdfs:label>
    <foaf:depiction rdf:resource="http://upload.wikimedia.org/wikipedia/commons/thumb/6/6d/Pteroglossus-torquatus-001.jpg/250px-Pteroglossus-torquatus-001.jpg" />
    <rdfs:comment>This resource is an individual toucan that happens to live in southern mexico.</rdfs:comment>
    <wdrs:describedby rdf:resource="http://iandavis.com/2010/303/toucan.rdf" />
  </dbp:Toucan>
  <foaf:Document rdf:about="http://iandavis.com/2010/303/toucan.rdf">
    <rdfs:label>A Description of a Toucan</rdfs:label>
    <rdfs:comment>This document is a description of the toucan resource.</rdfs:comment>
  </foaf:Document>
</rdf:RDF>

http://iandavis.com/2010/303/toucan is simply another name for whatever the above is.

Nope. It's not at all. That text you include is the entity sent when you issue a GET to the URI. Entity bodies aren't usually named on the web.

It's also a representation of http://iandavis.com/2010/303/toucan.rdf

You are conflating the resource with the content of an HTTP message sent to your computer. You could interpret the tabulator property as meaning the entity returned when you perform a GET on the URI contains the following class

Hints: - it's not a resource

It has a URI, http://iandavis.com/2010/303/toucan.rdf; anything identified by a URI is a resource.

- it's not a document

I think it is

- it's not an rdf document

I think it is

- it's not a toucan

Agree. That text is not a toucan.

Best, Nathan

Ian
Publishing Linked Data without Redirects
I wrote up a summary of the current thinking on using 200 instead of 303 to serve up Linked Data:

http://iand.posterous.com/a-guide-to-publishing-linked-data-without-red

The key part is:

When your webserver receives a GET request to your thing's URI you may respond with a 200 response code and include the content of the description document in the response provided that you:

1. include the URI of the description document in a content-location header, and
2. ensure the body of the response is the same as the body obtained by performing a GET on the description document's URI, and
3. include a triple in the body of the response whose subject is the URI of your thing, whose predicate is http://www.w3.org/2007/05/powder-s#describedby and whose object is the URI of your description document

But read the whole post for an example, some theoretical background and some FAQs. Cheers, Ian
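Put together, the three rules yield an exchange along these lines. This is a sketch reusing the toucan URIs from elsewhere in the thread, shown with an abbreviated Turtle body for brevity (the real toucan.rdf served RDF/XML):

```
GET /2010/303/toucan HTTP/1.1
Host: iandavis.com
Accept: text/turtle

HTTP/1.1 200 OK
Content-Location: http://iandavis.com/2010/303/toucan.rdf
Content-Type: text/turtle

@prefix wdrs: <http://www.w3.org/2007/05/powder-s#> .

<http://iandavis.com/2010/303/toucan>
    wdrs:describedby <http://iandavis.com/2010/303/toucan.rdf> .
```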
Re: What would break, a question for implementors? (was Re: Is 303 really necessary?)
On Fri, Nov 5, 2010 at 9:53 AM, Leigh Dodds leigh.do...@talis.com wrote: So here are a couple of questions for those of you on the list who have implemented Linked Data tools, applications, services, etc:

* Do you rely on or require HTTP 303 redirects in your application? Or does your app just follow the redirect?
* Would your application/tool/service/etc break or generate inaccurate data if Ian's pattern was used to publish Linked Data?

I used Denny Vrandečić's browser tool to test several Linked Data browsers including Tabulator: http://browse.semanticweb.org/?uri=http%3A%2F%2Fiandavis.com%2F2010%2F303%2Ftoucandays=7 None of these showed any confusion between the toucan and its description, nor did they throw warnings or errors about the lack of a 303, or in fact make any reference to it (Tabulator includes the response as RDF but does not infer that the 200 response implies a type of information resource, which I had assumed it would). Cheers, Ian
Re: Hash vs Slash in relation to the 303 vs 200 debate (was: Is 303 really necessary - demo)
On Sat, Nov 6, 2010 at 11:31 PM, Toby Inkster t...@g5n.co.uk wrote: Not necessarily. If you take your ex:isDescribedBy predicate and add that to a triple store where the non-Information-Resource resources are identified using hash URIs, then the SPARQL query is just:

DESCRIBE <uri> ?res WHERE { ?res ex:isDescribedBy <uri> . }

which needn't be very slow.

I've done this myself, but using foaf:primaryTopic and foaf:topic to link a document URI to all the resources that are needed to render it.

The other downside of fragments is you can't say it exists but I have no description of it.

#foo a rdfs:Resource .

In which case you do have a description of it :) But point taken, this tautology would be enough. Cheers, Ian
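For comparison, the foaf:primaryTopic arrangement mentioned above looks something like this; the URIs are illustrative only:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# The document links to the thing it primarily describes,
# plus any other resources needed to render it
<http://example.org/doc/alice>
    foaf:primaryTopic <http://example.org/doc/alice#me> ;
    foaf:topic        <http://example.org/doc/bob#me> .
```

Rendering the document is then a single query for everything it lists as a topic.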
Re: 200 OK with Content-Location might work
On Sun, Nov 7, 2010 at 7:35 PM, Phil Archer ph...@w3.org wrote: I share John's unease here. And I remain uneasy about the 200 C-L solution. I know I sound like a fundamentalist in a discussion where we're trying to find a practical, workable solution, but is a description of a toucan a representation of a toucan? IMO, it's not. Sure, one can imagine an HTTP response returning a very rich data stream that conveys the entire experience of having a toucan on your desk - but the toucan ain't actually there.

The content-location header says that the entity being sent in the response is a representation of the resource identified by the content-location URI, not of the request URI. I don't want to get into a heavy what is a representation really debate because those have been done in minute detail over on the TAG list for many years. Suffice to say that http://www.google.com/ has a representation that is not the entire experience of the google website, yet that URI denotes the google website for the majority of people.

I've been toying with the idea of including a substitution rule in a 200 header.

I'd prefer not to invent anything new, e.g. new headers or status codes. I'm just looking to simplify an existing set of patterns.

My worry is that any 200-based solution is going to be poorly implemented in the real world by both browsers and LOD publishers (Talis excepted of course!) so that IRs and NIRs will be indistinguishable 'in the wild'. My proposal keeps the two separate with distinct URIs.

With clear guides, education and testing tools we can encourage people to do the right thing, just like we currently do with any standard. However, philosophically I wonder whether there are any practical consequences of them being indistinguishable. When I read my email in gmail it is hard to separate the email from the webpage allowing me to read the email, yet it still works.

303 works already, and that is still the one that feels right to me. I'm happy that the discussion here is centred on adding a new method cf.
replacing 303, especially as the HTTP-bis group seems to have made its use for LOD an explicit part of the definition.

It would still be available. My proposal is to provide a streamlined alternative, one that is more in line with what millions of webmasters are doing already. Ian
Re: 200 OK with Content-Location might work
On Fri, Nov 5, 2010 at 4:55 PM, Nathan nat...@webr3.org wrote: Mike Kelly wrote: http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-12#page-14 snipped and fuller version inserted: 4. If the response has a Content-Location header field, and that URI is not the same as the effective request URI, then the response asserts that its payload is a representation of the resource identified by the Content-Location URI. However, such an assertion cannot be trusted unless it can be verified by other means (not defined by HTTP). If a client wants to make a statement about the specific document then a response that includes a content-location is giving you the information necessary to do that correctly. It's complemented and further clarified in the entity body itself through something like isDescribedBy. I stand corrected; I think there's something in this, and it could maybe possibly provide the semantic indirection needed when Content-Location is there, and different to the effective request URI, and complemented by some statements (perhaps RDF in the body, or Link header, or html link element) to assert the same. Covers a few use-cases, might have legs (once HTTP-bis is a standard?). Nicely caught Mike! +1 This is precisely what we need. Ian
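The HTTP-bis rule quoted above can be sketched as a small client-side check. This is an illustrative sketch only (the function name and the example URIs are mine, not from the thread), and a real client would also resolve a relative Content-Location against the request URI before comparing:

```python
def described_resource(effective_request_uri, headers):
    """Return the URI the response payload claims to represent, per the
    HTTP-bis rule quoted above: if Content-Location is present and differs
    from the effective request URI, the payload is asserted to be a
    representation of the Content-Location URI. (HTTP-bis adds that this
    assertion should be verified by other means, e.g. triples in the body.)"""
    content_location = headers.get("Content-Location")
    if content_location and content_location != effective_request_uri:
        return content_location
    return effective_request_uri

# A 200 response for the toucan URI carrying a Content-Location pointing at
# the description document leaves the request URI free to denote the bird:
doc = described_resource(
    "http://iandavis.com/2010/303/toucan",
    {"Content-Location": "http://iandavis.com/2010/303/toucan.rdf"},
)
```

Here any statement the client wants to make "about the specific document" attaches to `doc`, not to the toucan URI itself.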
Hash vs Slash in relation to the 303 vs 200 debate (was: Is 303 really necessary - demo)
On Fri, Nov 5, 2010 at 5:28 PM, Nathan nat...@webr3.org wrote: URI resolution is essentially: dereference( uri.toAbsolute() ); Which gives us the simplicity and semantic indirection which we need. Use frags, forget HTTP, know that uri#frag is never going to be a document (unless you explicitly say it is). On a practical level, using frags can be inefficient when your linked data output is backed by a triple store. If you use a slash URI then generating the data for html/xml/turtle output is just a simple DESCRIBE of the URI. For hash URIs you need to describe all the resources with a common prefix, because the fragment is not sent by the browser with the request. That might mean a filter with a regex or string functions, which will be less efficient. If you pick a standard fragment such as #this, #it etc. then you can revert to the simple DESCRIBE, so the inefficiency only arises when there are multiple arbitrary fragments per document URI. The other downside of fragments is you can't say "it exists but I have no description of it". With standard slash URIs you can 303 to a 404 to say that. You can 404 on the slash URI itself to say the resource does not exist. With my proposal to use 2xx responses you can return 204 No Content to indicate the resource exists but no description is available. With slashes you can use 410 on an individual resource to indicate that it has gone forever. You can also do this with the one-frag-per-doc approach, although you are really saying the description document has gone and the user is left to infer the secondary resource has also gone. With multiple frags per doc (i.e. a lot of schemas) you can't say just one of those resources has gone forever.
In summary:
- Slash with 303: hard static publishing, efficient dynamic, can ack existence without description
- Hash, one resource per doc: easy static publishing, efficient dynamic, can't ack existence without description
- Hash, many resources per doc (the typical schema case): easy static publishing, less efficient dynamic, can't ack existence without description
- Slash with 2xx: easy static publishing, efficient dynamic, can ack existence without description

Ian
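The dynamic-publishing cost difference described above can be made concrete by looking at the query a SPARQL-backed server would have to run for each style. This is a sketch under stated assumptions: the function name and example URIs are illustrative, and the hash-style query is just one way (a STRSTARTS filter) of describing every resource sharing the document's prefix:

```python
def query_for(request_uri):
    """Sketch of the trade-off above. A slash URI maps to a simple DESCRIBE;
    a hash URI never reaches the server (clients strip the fragment before
    the request), so the server only sees the document URI and must select
    every subject beginning with that prefix."""
    if "#" in request_uri:
        raise ValueError("fragments are stripped by clients before the request")
    describe = f"DESCRIBE <{request_uri}>"
    # Hash-style publishing: filter on the common prefix instead.
    hash_style = (
        "SELECT ?s ?p ?o WHERE { ?s ?p ?o . "
        f'FILTER STRSTARTS(STR(?s), "{request_uri}#") }}'
    )
    return describe, hash_style
```

Picking a single standard fragment (#this, #it) collapses the second case back into a DESCRIBE of one known URI, which is the point made above.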
Is 303 really necessary - demo
Hi all, To aid discussion I created a small demo of the idea put forth in my blog post http://iand.posterous.com/is-303-really-necessary Here is the URI of a toucan: http://iandavis.com/2010/303/toucan Here is the URI of a description of that toucan: http://iandavis.com/2010/303/toucan.rdf As you can see, both these resources have distinct URIs. I created a new property http://vocab.org/desc/schema/description to link the toucan to its description. The schema for that property is here: http://vocab.org/desc/schema (BTW I looked at the powder describedBy property and it's clearly designed to point to one particular type of description, not a general RDF one. I also looked at http://ontologydesignpatterns.org/ont/web/irw.owl and didn't see anything suitable) Here is the URI Burner view of the toucan resource and of its description document: http://linkeddata.uriburner.com/about/html/http://iandavis.com/2010/303/toucan http://linkeddata.uriburner.com/about/html/http://iandavis.com/2010/303/toucan.rdf I'd like to use this demo to focus on the main thrust of my question: does this break the web and if so, how? Cheers, Ian P.S. I am not fully caught up on the other thread, so maybe someone has already produced this demo
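The demo's linking pattern can be sketched as a tiny in-memory graph. This is an illustrative sketch, not the demo's actual code: the triples mirror what the toucan URI serves per the post (a type statement for the bird plus an explicit desc:description triple pointing at the document), and the helper function name is mine:

```python
# The slash URI denotes the bird; an explicit triple, not a status code,
# links it to the document that describes it.
TOUCAN = "http://iandavis.com/2010/303/toucan"
DOC = "http://iandavis.com/2010/303/toucan.rdf"
DESCRIPTION = "http://vocab.org/desc/schema/description"

triples = [
    (TOUCAN, "rdf:type", "dbp:Toucan"),
    (TOUCAN, DESCRIPTION, DOC),
]

def descriptions_of(resource, graph):
    """All documents that the data itself links as descriptions of `resource`."""
    return [o for s, p, o in graph if s == resource and p == DESCRIPTION]
```

Because the link is a triple, a publisher could list several description documents for the same thing, which a single 303 redirect cannot express.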
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 9:54 AM, William Waites w...@styx.org wrote: Provenance and debugging. It would be quite possible to record the fact that this set of triples, G, was obtained by dereferencing this URI N, at a certain time, from a certain place, with a request that looked like this and a response that had these headers and response code. The class of information that is kept for [0]. If N appeared in G, that could lead directly to inferences involving the provenance information. If later reasoning is concerned at all with the trustworthiness or up-to-dateness of the data it could look at this as well. Keeping this quantity of information around might quickly turn out to be too data-intensive to be practical, but that's more of an engineering question. I think it does make some sense to do this in principle at least. All the above would remain in my proposal. If you were in fact inferring triples from the 303 then those would already be in the data you are dereferencing. Ian
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 10:12 AM, Nathan nat...@webr3.org wrote: What's the point in you saying: /toucan a :Toucan; :describedBy /doc . If the rest of the world is saying: /toucan a :Document; :primaryTopic ex:Toucan . Follow? Because the data obtained by dereferencing /toucan is authoritative? Ian
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 10:05 AM, Nathan nat...@webr3.org wrote: Not at all, I'm saying that if big-corp makes a /web crawler/ that describes what documents are about and publishes RDF triples, then if you use 200 OK, throughout the web you'll get (statements similar to) the following asserted: /toucan :primaryTopic dbpedia:Toucan ; a :Document . I don't think so. If the bigcorp is producing triples from their crawl then why wouldn't they use the triples they are sent (and/or content-location, link headers etc.)? The above looks like what you'd get from a third-party translation of the crawl results without the context of actually having fetched the data from the URI. If the bigcorp is not linked data aware then today they will follow the 303 redirect as a standard HTTP redirect. rfc2616 says that the target URI is not a substitute for the original URI but just an alternate location to get a response from. The bigcorp will simply infer the statements you list above **even though there is a 303 redirect**. As rfc2616 itself points out, many user agents treat 302 and 303 interchangeably. Only linked data aware agents will ascribe special meaning to 303, and they're the ones that are more likely to use the data they are sent. Ian
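The argument above, that a 303 does not stop a naive crawler from attributing the fetched data to the original URI, can be simulated in a few lines. The response table, URIs and function are hypothetical, for illustration only:

```python
# Hypothetical server behaviour: the thing URI 303-redirects to a document.
RESPONSES = {
    "http://example.org/toucan": (303, "http://example.org/doc/toucan", None),
    "http://example.org/doc/toucan": (200, None, "some RDF about the toucan"),
}

def crawl(uri, linked_data_aware=False):
    """Follow redirects the way rfc2616 notes many agents do, treating 302
    and 303 interchangeably. A naive agent records the payload against the
    *original* URI, so the 303 does not prevent it inferring '/toucan is a
    document'; only a linked-data-aware agent ascribes special meaning to
    the 303 and attributes the payload to the redirected-to URI."""
    original = uri
    status, location, body = RESPONSES[uri]
    while status == 303:
        uri = location
        status, location, body = RESPONSES[uri]
    subject = uri if linked_data_aware else original
    return subject, body
```

The agents that do distinguish 303 are exactly the linked-data-aware ones, which would also read the explicit description triples in the body.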
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 10:34 AM, Nathan nat...@webr3.org wrote: and if I publish: http://webr3.org/nathan#me :isKingOf :TheWorld . it's authoritative and considered true? great news all round :) No :) I meant that when you dereference http://iandavis.com/2010/303/toucan, the triples you get about http://iandavis.com/2010/303/toucan can be considered authoritative. For me, that's one of the principal advantages of linked data over other data formats - built-in provenance if you will. Also, authoritative does not mean true. It means it asserts authority over the data. That could very well be wrong, but it's the consumer's choice whether they trust that authority or not (or can prove it perhaps). Ian
Re: Is 303 really necessary?
Hi David, Rather than respond to each of your points let me say that I agree with most of them :) I have snipped away the things I agree with in principle, and left the things I want to discuss further. I have a question about http://thing-described-by.org/ - how does it work when my description document describes multiple things? Really, any RDF document that references more than one resource as a subject or object can be considered to be providing a description of all those resources. On Thu, Nov 4, 2010 at 10:10 PM, David Booth da...@dbooth.org wrote: 2. only one description can be linked from the toucan's URI True, but that's far better than zero, if you only have the toucan URI and it returns 404! It could return 204. 3. the user enters one URI into their browser and ends up at a different one, causing confusion when they want to reuse the URI of the toucan. Often they use the document URI by mistake. Yes, that's a problem. The trade-off is ambiguity. I don't think so. The ambiguity is not present because the data explicitly distinguishes the two URIs (and content-location header does too). 7. it mixes layers of responsibility - there is information a user cannot know without making a network request and inspecting the metadata about the response to that request. When the web server ceases to exist then that information is lost. I don't buy this argument. While I agree that explicit statements such as Utoucan :isDescribedBy Upage . is helpful and should be provided, that does *not* mean that links are not *also* useful. Just because links do not *always* work does not mean that they are useless. But you agree that under the current scheme, some things are knowable only by making a network request. It's not enough to have just the RDF description document? Cheers, Ian
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 10:57 AM, Nathan nat...@webr3.org wrote: I'll roll with the "who cares" line of thinking, I certainly don't care how you or dbpedia or foaf or dc publish your data, so long as I can deref it, but for god's sake don't go telling everybody using slash URIs and 200 is The Right Thing TM Sure. We don't want to restrict choice or options. I am focussed here only on simplifying a common pattern. Ian
Re: Is 303 really necessary - demo
On Fri, Nov 5, 2010 at 1:53 PM, Jörn Hees j_h...@cs.uni-kl.de wrote: If I GET http://iandavis.com/2010/303/toucan I retrieve a document (I'll call this A) with RDF statements. This is not correct. You receive a response with an entity: the representation. (Here entity is used in the rfc2616 sense) If I GET http://iandavis.com/2010/303/toucan.rdf I retrieve another document (I'll call this B), which in this case happens to have the same content as A, but could be different, couldn't it? I could return a different entity, but I wouldn't recommend it. You might want to if you link to multiple descriptions of the resource. Now: how can I say that I don't like A without saying that I don't like http://iandavis.com/2010/303/toucan ? If your answer is going to be "say you don't like B" again, please explain what happens if A and B don't have the same content. How do you currently refer to the entity transmitted in an HTTP response? You don't - they have no names. How do you say you are offended by something written on twitter's home page when the entity it sends changes every second? Is there some magic involved saying that any ?s with a ?s http://vocab.org/desc/schema/description ?d . is not a document but a real-world object? No. But the description document and the entity returned from the request to /toucan say it's a dbp:Toucan. I would put more credence in explicit statements than implicit ones. Or is there some magic involved that if toucan and toucan.rdf give you the same content then one of them is a real-world object? No. If not, how can I find out that http://iandavis.com/2010/303/toucan is one and A is only one of its descriptions? Look at the data - it states it clearly. Jörn PS: is there a summary of this discussion somewhere? I'm afraid not, it's only been going a few hours. I haven't seen anything that fundamentally challenges the idea yet, i.e. something that would make me rewrite it.
I am seeing several responses arguing that it confuses the thing with the document, but I explicitly show how it doesn't in the blog post, so I think that comes from people's default assumptions. There are some responses saying "don't do that - use fragments instead", but that's no help for the millions of resources already deployed with slash URIs and the many people who prefer that style. Other responses have been like yours, seeking more clarity on the ideas. Ian
Re: What would break, a question for implementors? (was Re: Is 303 really necessary?)
On Fri, Nov 5, 2010 at 12:12 PM, Nathan nat...@webr3.org wrote: However, if you use 303s then the first GET redirects, then you store the ontology against the redirected-to URI; you still have to do 40+ GETs but each one is fast with no response body (ontology sent down the wire), then the next request for the 303'd-to URI comes right out of the cache. It's still 40+ requests unless you code around it in some way, but it's better than 40+ requests and 40+ copies of the single ontology. But in practice, don't you look in your cache first? If you already have a label for foaf:knows because you looked up foaf:mbox a few seconds ago, why would you issue another request? Ian
Re: Is 303 really necessary?
On Fri, Nov 5, 2010 at 12:11 PM, Norman Gray nor...@astro.gla.ac.uk wrote: httpRange-14 requires that a URI with a 200 response MUST be an IR; a URI with a 303 MAY be a NIR. Ian is (effectively) suggesting that a URI with a 200 response MAY be an IR, in the sense that it is defeasibly taken to be an IR, unless this is contradicted by a self-referring statement within the RDF obtained from the URI. Thank you for writing this - it's exactly what I mean. Ian
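Norman's formulation above reads naturally as a defeasible classifier over the retrieved data: a 200 defaults to "information resource" unless a self-referring statement in the RDF says otherwise. The following sketch is illustrative only; the function, the triple encoding, and the document type names are assumptions, not anything defined in the thread:

```python
# Illustrative type names a client might treat as "document-like".
DOCUMENT_TYPES = {"foaf:Document", "bibo:Document"}

def classify(uri, status, triples):
    """Sketch of the defeasible rule above. Under httpRange-14 a 200 MUST
    mean the URI names an information resource; under the relaxed reading a
    200 only *defaults* to IR, and a self-referring rdf:type statement in
    the retrieved data can override that default."""
    if status == 303:
        return "maybe-NIR"  # 303 never commits either way
    assert status == 200
    for s, p, o in triples:
        if s == uri and p == "rdf:type" and o not in DOCUMENT_TYPES:
            return "NIR"    # the data contradicts the IR default
    return "IR"
```

On this reading the toucan demo is consistent: dereferencing the toucan URI returns 200, but the data's self-referring `a dbp:Toucan` statement defeats the IR default.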
Re: isDefinedBy and isDescribedBy, Tale of two missing predicates
Kingsley, My only gripe is with mutual exclusion. ..dropping 303... didn't come across as adding an option to the mix. Ditto positioning 303 as a mandate, which it's never really been. I think you read too much conspiracy into 140 characters. Ian
Re: isDefinedBy and isDescribedBy, Tale of two missing predicates
On Fri, Nov 5, 2010 at 3:29 PM, Nathan nat...@webr3.org wrote: Better clear that up, noticed that it's an age old XHTML-RDFa potential issue, so I'll see if we can get it covered in the WG and relay back to the TAG to hopefully clear the issue. Suppose I assign the ID 'mars' to represent the planet mars in my RDFa document. I can then refer to it using http://example.com/foo#mars. What does it mean when my javascript calls document.getElementById('mars')? Should I expect now to manipulate the planet mars? This is a dilemma analogous to the one for slash URIs, except the domain is html + javascript rather than http. Just like your claimed slash URI problems, in practice it is a non-issue because people don't really expect to manipulate planets with javascript. Some dumb machine might assume that in the future, but they are going to make a lot of similar mistakes that will cause a lot more problems. Ian
Re: Is 303 really necessary - demo
On Fri, Nov 5, 2010 at 4:42 PM, Robert Fuller robert.ful...@deri.org wrote: I submitted both URLs to sindice earlier. Both were indexed and have the same content. In the search results[1] one displays with title A Toucan, the other with title A Description of a Toucan. [1] http://sindice.com/search?q=toucan+domain%3Aiandavis.com&qt=term So Sindice sees them as distinct resources and doesn't concern itself with the lack of a 303 redirect? Ian
Is 303 really necessary?
Hi all, The subject of this email is the title of a blog post I wrote last night questioning whether we actually need to continue with the 303 redirect approach for Linked Data. My suggestion is that replacing it with a 200 is in practice harmless and that nothing actually breaks on the web. Please take a moment to read it if you are interested. http://iand.posterous.com/is-303-really-necessary Cheers, Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 2:13 PM, Kingsley Idehen kide...@openlinksw.com wrote: Ian, Q: Is 303 really necessary? A: Yes, it is. Why? Read on... I don't think you explain this in your email. What's the problem with having many options re. mechanics for associating an HTTP based Entity Name with a Descriptor Resource Address? Do you mean associate a resource with a description? Or do you mean something else? Can you rephrase using the terminology that everyone else uses, please. We shouldn't be narrowing options for implementing the fundamental essence of Linked Data -- hypermedia based data representation. Of course, we can discuss and debate individual, product, or organization preferences etc.. But please let's not push these as mandates. We should never mandate that 303s are bad, never. It's an implementation detail, no more no less. I'm suggesting that we relax a mandate to always use 303, and since you're saying we must not narrow options then you seem to be supporting my suggestion. The only thing that should be mandatory re. Linked Data is this: HTTP based Entity Names should Resolve to structured Descriptors that are Human and/or Machine decipherable. Are you saying that requesting a URI should return a description document? Ironically, bearing in mind my comments, we do arrive at the same conclusion, but in different ways. I phrase my conclusion as: heuristics for implementing HTTP based Entity Names that Resolve to structured Descriptor Resources shouldn't dominate the Linked Data narrative, especially as comprehension of the fundamental concept remains mercurial. So are you contradicting your answer at the start of the post? -- Regards, Kingsley Idehen Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 3:00 PM, Kingsley Idehen kide...@openlinksw.com wrote: On 11/4/10 10:22 AM, Ian Davis wrote: On Thu, Nov 4, 2010 at 2:13 PM, Kingsley Idehenkide...@openlinksw.com wrote: Ian, Q: Is 303 really necessary? A: Yes, it is. Why? Read on... I don't think you explain this in your email. What's the problem with having many options re. mechanics for associating an HTTP based Entity Name with a Descriptor Resource Address? Do you mean associate a resource with a description? Or do you mean something else? Can you rephrase using the terminology that everyone else uses, please. Who is everyone else? How about the fact that terminology that you presume to be common is actually uncommon across broader spectrum computing. I don't presume. I prefer to use terms that are familiar to the people on this list who might be reading the message. Introducing unnecessary capitalised phrases distracts from the message. Anyway, translation: What's the problem with having a variety of methods for using LINKs to associate a Non Information Resource with an Information Resource that describes it (i.e., carries its structured representation)? Why place an implementation detail at the front of the Linked Data narrative? It's already at the front, and as I say in my post it's an impediment to using Linked Data by mainstream developers. This is an implementation detail that I think could do with improving, making it simpler and in fact removing it from the front of the narrative. It just becomes like commonplace web publishing. Do you agree that's a good goal to strive for? We shouldn't be narrowing options for implementing the fundamental essence of Linked Data -- hypermedia based data representation. Of course, we can discuss and debate individual, product, or organization preferences etc.. But please let's not push these as mandates. We should never mandate that 303s are bad, never. It's an implementation detail, no more no less.
I'm suggesting that we relax a mandate to always use 303, and since you're saying we must not narrow options then you seem to be supporting my suggestion. I didn't know there was a mandate to always use 303. Hence my comments. There is. I find it surprising that you're unaware of it because it's in all the primary documents about publishing Linked Data. The only thing that should be mandatory re. Linked Data is this: HTTP based Entity Names should Resolve to structured Descriptors that are Human and/or Machine decipherable. Are you saying that requesting a URI should return a description document? Resolve to a Descriptor Document which may exist in a variety of formats. Likewise, Descriptor documents (RDF docs, for instance) should clearly identify their Subject(s) via HTTP URI based Names. Example (in this example we have 1:1 re. Entity Name and Descriptor for sake of simplicity): http://dbpedia.org/resource/Paris -- Name http://dbpedia.org/page/Paris -- Descriptor Resource (HTML+RDFa) this resource will expose other representations via head/ (link/ + @rel) or Link: in response headers etc.. Not sure what you are trying to say here. I must be misunderstanding because you appear to be claiming that http://dbpedia.org/resource/Paris is a name but http://dbpedia.org/page/Paris is a resource. Assuming you are using angle brackets like they are used in Turtle then I think they are both resources. I would say: http://dbpedia.org/resource/Paris -- a resource named by the string http://dbpedia.org/resource/Paris; http://dbpedia.org/page/Paris -- a resource named by the string http://dbpedia.org/page/Paris; Also, in my view the first resource is actually the city of Paris whereas the second is a document about the first resource. I don't really see what relevance this all has to the issue of 303 redirection though. We are all agreed that things are not usually their own descriptions; we are discussing how that knowledge should be conveyed using Linked Data.
Ironically, bearing in mind my comments, we do arrive at the same conclusion, but in different ways. I phrase my conclusion as: heuristics for implementing HTTP based Entity Names that Resolve to structured Descriptor Resources shouldn't dominate the Linked Data narrative, especially as comprehension of the fundamental concept remains mercurial. So are you contradicting your answer at the start of the post? Huh? I am saying, what I've already stated: heuristics re. essence of Linked Data mechanics shouldn't front the conversation. You sort of arrive there too, but we differ re. mandates. See my comment above: I am removing them from the front. Potential point of reconciliation: You assumed that 303 is an existing mandate. I am totally unaware of any such mandate. See above. I don't even buy into HTTP scheme based Names as a mandate, they simply make the most sense courtesy of Web ubiquity. As is already the case re., LINK
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 3:21 PM, Giovanni Tummarello giovanni.tummare...@deri.org wrote: Hi Ian no it's not needed see this discussion http://lists.w3.org/Archives/Public/semantic-web/2007Jul/0086.html pointing to 203 406 or others.. ..but a number of social community mechanisms will activate if you bring this up, ranging from Russian-style "you're being antipatriotic criticizing the existing status quo" to "..but it's so deployed now" and "..you're distracting the community from other more important issues"; none of this will make sense if analyzed by proper logical means of course (e.g. by a proper IT manager in a proper company, paid based on actual results). Yes, but I guess I have to face those to make progress. But the core of the matter really is: who cares. My educated guess looking at Sindice flowing data is that every day out of 100 new sites on the web of data 99.9 simply use RDFa which doesn't have this issue. I think it's an orthogonal issue to the one RDFa solves. How should I use RDFa to respond to requests to http://iandavis.com/id/me which is a URI that denotes me? choose how to publish yourself but here is another one. If you choose NOT to use RDFa you will miss out on anything which will enhance the user experience based on annotations. As an example see our entry in the semantic web challenge [1]. I'm agnostic on formats, just trying to make things simpler for publishers who want to use hashless URIs in their data. Ian Giovanni [1] http://www.cs.vu.nl/~pmika/swc/submissions/swc2010_submission_19.pdf
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 3:50 PM, Giovanni Tummarello giovanni.tummare...@deri.org wrote: I think it's an orthogonal issue to the one RDFa solves. How should I use RDFa to respond to requests to http://iandavis.com/id/me which is a URI that denotes me? hashless? mm one could be to return HTML + RDFa describing yourself. add a triple saying http://iandavis.com/id/me containstriplesonlyabouttheresourceandnoneaboutitselfasinformationresource Yes, that's basically what I'm saying in my blog post. it's up to clients to really care about the distinction, I personally know of no useful clients for the web of data that will visibly misbehave if a person is mistaken for a page.. so you can certify to your customer your solution works well with any client Good to know. That's my sense too. if one will come up which operates usefully on both people and pages and would benefit from making your distinction then those coding that client will definitely learn about your containstriplesonlyabouttheresourceandnoneaboutitselfasinformationresource and support it. how about this ? :-) Sounds good to me :) as an alternative the post I pointed you to earlier (the one about 203 406) did actually contain an answer I believe. 406 is perfect IMO .. I'd say a client which will care to make the distinction would learn to support it as in my previous example. I'll look into that. Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 3:56 PM, Kingsley Idehen kide...@openlinksw.com wrote: I don't presume. I prefer to use terms that are familiar to the people on this list who might be reading the message. Introducing unnecessary capitalised phrases distracts from the message. Again, you presume. Capitalization might not work for you, but you are not the equivalent of an entire mailing list audience. You are one individual entitled to a personal opinion and preferences. I hope you agree I have the freedom to express those opinions. Anyway, translation: What's the problem with having a variety of methods for using LINKs to associate a Non Information Resource with an Information Resource that describes it (i.e., carries its structured representation)? Why place an implementation detail at the front of the Linked Data narrative? It's already at the front, and as I say in my post it's an impediment to using Linked Data by mainstream developers. I don't believe it's already at the front. I can understand if there was some quasi-mandate that put it at the front. Again, you are jumping to conclusions, then pivoting off the conclusions to make a point. IMHO: Net effect, Linked Data concept murkiness and distraction. You are inadvertently perpetuating a misconception. Thank you for your opinion. I don't believe I am jumping to conclusions. There is. I find it surprising that you're unaware of it because it's in all the primary documents about publishing Linked Data. Please provide a URL for the document that establishes this mandate. I know of no such document. Of course I am aware of documents that offer suggestions and best practice style guidelines. Here is one cited by Leigh just now: http://www.w3.org/TR/cooluris/ Also http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html And http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/ The only thing that should be mandatory re.
Linked Data is this: HTTP based Entity Names should Resolve to structured Descriptors that are Human and/or Machine decipherable. Are you saying that requesting a URI should return a description document? Resolve to a Descriptor Document which may exist in a variety of formats. Likewise, Descriptor documents (RDF docs, for instance) should clearly identify their Subject(s) via HTTP URI based Names. Example (in this example we have 1:1 re. Entity Name and Descriptor for sake of simplicity): http://dbpedia.org/resource/Paris -- Name http://dbpedia.org/page/Paris -- Descriptor Resource (HTML+RDFa) this resource will expose other representations via head/ (link/ + @rel) or Link: in response headers etc.. Not sure what you are trying to say here. I must be misunderstanding because you appear to be claiming that http://dbpedia.org/resource/Paris is a name but That is a Name via HTTP URI (using its Name aspect). This is an interesting distinction between the resource and a name. Can you restate it in a new thread so we don't add noise to the 303 discussion? I don't really see what relevance this all has to the issue of 303 redirection though. We are all agreed that things are not usually their own descriptions; we are discussing how that knowledge should be conveyed using Linked Data. Of course, my comments are irrelevant, off topic. If that works for you, then good for you. You spent all this time debating an irrelevance. That looks like a natural close to this particular part of the debate then. FWIW - 303 is an implementation detail, RDF is an implementation detail, and so is SPARQL. When you front line any conversation about the concept of Linked Data with any of the aforementioned, you are only going to make the core concept incomprehensible. Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 4:17 PM, Kingsley Idehen kide...@openlinksw.com wrote: On 11/4/10 11:50 AM, Giovanni Tummarello wrote: it's up to clients to really care about the distinction, I personally know of no useful clients for the web of data that will visibly misbehave if a person is mistaken for a page.. so you can certify to your customer your solution works well with any client Gio, Keyword: visibly. Once the Web of Linked Data crystallizes, smart agents will emerge and start roaming etc.. These agents need precision, so ambiguity will cause problems. At this point there will be broader context for these matters. Please don't dismiss this matter, things are going to change quickly, we live in exponential times. If the success of these agents is predicated on precision then they are doomed to failure. The web is a messy place but it's precisely that messiness that allows it to scale. Anyone building serious web data apps is used to dealing with ambiguity all the time and has strategies for compensating. Linked Data offers a route to higher precision, but in no way is it a panacea or silver bullet. Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 4:52 PM, Robin YANG yang.squ...@gmail.com wrote: Ok, yes, we can use an ontology or ex:isDescribedBy, but none of these solutions explains what happens when you dereference the URI over HTTP which you just used to refer to the non-information resources. Don't you need 303 or a hash URI again to differentiate when dereferencing whatever subject URI we minted before ex:isDescribedBy? When you dereference the URI you get back a representation with some data about the thing the URI denotes. I don't think you need any other URI unless you also want to assign a URI to the representation itself. Ian
Re: Is 303 really necessary?
Hi Dave, On Thu, Nov 4, 2010 at 4:56 PM, David Wood da...@3roundstones.com wrote: Hi all, This is a horrible idea, for the following reasons (in my opinion and suitably caveated): - Some small number of people and organizations need to provide back-links on the Web since the Web doesn't have them. 303s provide a generic mechanism for that to occur. URL curation is a useful and proper activity on the Web, again in my opinion. The relationship between 303 redirection and backlinks isn't clear to me. Can you expand? - Overloading the use of 200 (OK) for metadata creates an additional ambiguity in that the address of a resource is now conflated with the address of a resource described by metadata. My post addresses that case. I don't encourage people to use the same URI for both the metadata and the thing but to link them using a new predicate ex:isDescribedBy. I also say that you should believe the data. If the data says the thing you dereferenced is a document then that's what you should assume it is. If it says it's a toucan then that's what it is. - W3C TAG findings such as http-range-14 are really very difficult to overcome socially. Maybe so, but I don't think that should stop 5 years of deployment experience from informing a change of practice. This isn't really relevant to my main question though: what breaks on the web. - Wide-spread mishandling of HTTP content negotiation makes it difficult if not impossible to rely upon. Until we can get browser vendors and server vendors to handle content negotiation in a reasonable way, reliance on it is not a realistic option. That means that there needs to be an out-of-band mechanism to disambiguate physical, virtual and conceptual resources on the Web. 303s plus http-range-14 provide enough flexibility to do that; I'm not convinced that overloading 200 does. My proposal isn't dependent on conneg. You can use it with the same caveats as anywhere else. 
But the simple case is just to serve up some RDF at the URI being dereferenced. BTW, conneg is very widely deployed in the Linked Data web and doesn't seem to have been a problem. /me ducks for the inevitable mud slinging this list has become. We can improve the quality of discussion on this list. Regards, Dave Ian
Re: Is 303 really necessary?
On Thursday, November 4, 2010, Nathan nat...@webr3.org wrote: Please, don't. 303 is a PITA, and it has detrimental effects across the board from network load through to server admin. Likewise #frag URIs have their own set of PITA features (although they are nicer on the network and servers). However, and very critically (if you can get more critical than critical!), both of these patterns / constraints are here to ensure that different things have different names, and without that distinction our data is junk. I agree with this and I address it in my blog post where I say we should link the thing to its description using a triple rather than a network response code. This goes beyond your and my personal opinions, or those of anybody here, the constraints are there so that in X months time when multi-corp trawls the web, analyses it and releases billions of statements like { /foo :hasFormat x; sioc:about dbpedia:Whatever } about each doc on the web, that all of those statements are said about documents, and not about you or me, or anything else real, that they are said about the right thing, the correct name is used. I don't see that as a problem. It's an error because it's not what the original publisher intended but there are many, many examples where that happens in bulk, and actually the 303 redirect doesn't prevent it happening with naive crawlers. If someone asserts something we don't have to assume it is automatically true. We can get authority about what a URI denotes by dereferencing it. We trust third party statements as much or as little as we desire. And this is critically important, to ensure that in X years time when somebody downloads the RDF of 2010 in a big *TB sized archive and considers the graph of RDF triples, in order to make sense of some parts of it for something important, that the data they have isn't just unreasonable junk. Any decent reasoner at that scale will be able to reject triples that appear to contradict one another.
Seeing properties such as format against a URI that everyone else claims denotes an animal is going to stick out. It's not about what we say something is, it's about what others say the thing is, and if you 200 OK the URIs you currently 303, then it will be said that you are a document, as simple as that. Saying you are a document isn't the killer, it's the hundreds of other statements said alongside that which make things so ambiguous that the info is useless. That's only true under the httpRange-14 finding which I am proposing is part of the problem. If 303s are killing you then use fragment URIs, if you refuse to use fragments for whatever reason then use something new like tdb:'s, support the data you've published in one pattern, or archive it and remove it from the web. These are publishing alternatives, but I'm focussed on the 303 issue here. But, for whatever reasons, we've made our choices, each has pros and cons, and we have to live with them - different things have different names, and the giant global graph is usable. Please, keep it that way. Agree, different things have different names, that's why I emphasise it in the blog post. I don't agree that the status quo is the best state of affairs. Best, Nathan Ian
Re: Is 303 really necessary?
On Thu, Nov 4, 2010 at 6:08 PM, Nathan nat...@webr3.org wrote: You see it's not about what we say, it's about what others say, and if 10 huge corps analyse the web and spit out billions of triples saying that anything 200 OK'd is a document, then at the end when we consider the RDF graph of triples, all we're going to see is one statement saying something is a nonInformationResource and a hundred others saying it's a document and describing what it's about together with its format and so on. I honestly can't see how anything could reason over a graph that looked like that. I honestly believe that's the least of our worries. How often do you need to determine whether something in the universe of discourse is an electronic document or not compared with all the other questions you might be asking of your data? I might conceivably ask "show me all the documents about this toucan" but I'd much rather ask "show me all the data about this toucan". However, I'm also very aware that this all may be moot anyway, because many crawlers and HTTP agents just treat HTTP like a big black box, they don't know there ever was a 303 and don't know what the end URI is (even major browser vendors like chrome do this, setting the base wrong and everything) - so even the current 303 pattern doesn't keep different things with different names for /slash URIs in all cases. That's true. I don't suppose any of the big crawlers care about the semantics of 303 because none of them care about the difference between a thing and its description. For example the Google OpenSocial doesn't give a hoot about the difference and yet seems to still function. As I say above, this document/thing distinction is actually quite a small area to focus on compared with the real problems of analysing the web of data as a whole. Best, Nathan Ian
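Ian's point that "we can get authority about what a URI denotes by dereferencing it" can be made concrete. A minimal Python sketch (the URIs, predicates and source labels are all hypothetical) of preferring the publisher's own type assertion over bulk third-party claims:

```python
# Each assertion is (subject, predicate, object, source). The "publisher"
# assertions are the ones obtained by dereferencing the URI itself; the
# others come from naive crawlers that typed everything 200 OK'd as a document.
assertions = [
    ("/toucan", "rdf:type", "ex:Animal", "publisher"),
    ("/toucan", "rdf:type", "foaf:Document", "crawler1"),
    ("/toucan", "rdf:type", "foaf:Document", "crawler2"),
]

def trusted_types(subject, assertions, authoritative_source):
    """Prefer the type stated by the authority (the URI's publisher);
    fall back to third-party claims only when the authority is silent."""
    authoritative = [o for s, p, o, src in assertions
                     if s == subject and p == "rdf:type" and src == authoritative_source]
    if authoritative:
        return authoritative
    return [o for s, p, o, src in assertions if s == subject and p == "rdf:type"]

print(trusted_types("/toucan", assertions, "publisher"))  # ['ex:Animal']
```

This is one way a consumer could "trust third party statements as much or as little as we desire" without any help from response codes.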
Re: Is 303 really necessary?
On Thursday, November 4, 2010, Jörn Hees j_h...@cs.uni-kl.de wrote: Hi Ian, From your blog post: Under my new scheme: GET /toucan responds with 200 and a representation containing some RDF which includes the triples /toucan ex:owner /anna and /toucan ex:isDescribedBy /doc GET /doc responds with 200 and a representation containing some RDF which includes the triple /doc ex:owner /fred So how can I then say that your description of the toucan is wrong without saying that the poor toucan is wrong? Use the URI of the document: /doc Jörn Ian
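Ian's answer can be sketched in code: because the toucan and its description are distinct subjects joined by ex:isDescribedBy, a complaint about the description attaches to /doc and leaves /toucan untouched. A minimal Python sketch using the triples from the quoted blog post (ex:hasError is a hypothetical predicate added for illustration):

```python
# Triples from the quoted example: the bird and the page describing it.
triples = [
    ("/toucan", "ex:owner", "/anna"),
    ("/toucan", "ex:isDescribedBy", "/doc"),
    ("/doc", "ex:owner", "/fred"),
]

def about(subject):
    """All (predicate, object) pairs asserted about one subject."""
    return [(p, o) for s, p, o in triples if s == subject]

# Jörn's correction targets the document, not the toucan.
triples.append(("/doc", "ex:hasError", "true"))
print(about("/toucan"))  # statements about the bird only
print(about("/doc"))     # statements about the page, including the correction
```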
Re: Please allow JS access to Ontologies and LOD
On Sat, Oct 23, 2010 at 2:28 AM, Nathan nat...@webr3.org wrote: Hi Ian, Thanks, I can confirm the change has been successful :) However, one small note is that the conneg URIs such as http://productdb.org/gtin/00319980033520 do not expose the header, thus can't be used. Ta. These should be emitting the header now. Ian
Re: Please allow JS access to Ontologies and LOD
Hi Nathan, I implemented this header on http://productdb.org/ (since I had the code open). Can someone confirm that it does what's expected (i.e. allows off-domain requesting of data from productdb.org)? One important thing to note. The PHP snippet you gave was slightly wrong. The correct form is: header("Access-Control-Allow-Origin: *"); Cheers, Ian On Sat, Oct 23, 2010 at 12:04 AM, Nathan nat...@webr3.org wrote: Hi All, Currently nearly all the web of linked data is blocked from access via client side scripts (javascript) due to CORS [1] being implemented in the major browsers. Whilst this is important for all data, there are many of you reading this who have it in your power to expose huge chunks of the RDF on the web to JS clients; if you manage any of the common ontologies or anything in the LOD cloud diagram, please do take a few minutes from your day to expose the single HTTP header needed. Long story short, to allow JS clients to access our open data we need to add one small HTTP response header which will allow HEAD/GET and POST requests - the header is: Access-Control-Allow-Origin: * This is both XMLHttpRequest (W3C) and XDomainRequest (Microsoft) compatible and supported by all the major browser vendors. Instructions for common servers follow. If you're on Apache then you can send this header by simply adding the following line to a .htaccess file in the dir you want to expose (probably site-root): Header add Access-Control-Allow-Origin "*" For NGINX: add_header Access-Control-Allow-Origin *; see: http://wiki.nginx.org/NginxHttpHeadersModule For IIS see: http://technet.microsoft.com/en-us/library/cc753133(WS.10).aspx In PHP you add the following line before any output has been sent from the server: header("Access-Control-Allow-Origin", "*"); For anything else you'll need to check the relevant docs I'm afraid. Best TIA, Nathan [1] http://dev.w3.org/2006/waf/access-control/
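For servers not covered by Nathan's list, the same header can be emitted from whatever request-handling layer you control. A minimal WSGI sketch in Python (illustrative only, not the productdb.org code), invoked directly to show the headers it would send:

```python
def app(environ, start_response):
    # Serve the data with the one extra header needed for cross-origin JS access.
    headers = [
        ("Content-Type", "text/turtle"),
        ("Access-Control-Allow-Origin", "*"),
    ]
    start_response("200 OK", headers)
    return [b"# RDF payload goes here\n"]

# Call the app directly (no network) to inspect the response headers.
captured = {}
def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

body = b"".join(app({}, start_response))
print(captured["headers"]["Access-Control-Allow-Origin"])  # *
```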
Re: Linked Data Movie Quiz
Guillermo, This is a very nice demo, well done! Is the architecture generic enough to apply to other datasets? Musicbrainz would be a good one once the new linked data version is ready. Cheers, Ian 2010/10/17 Guillermo Álvaro Rey galv...@isoco.com: Dear LODers, We have created a simple webapp that generates questions about cinema by querying the Linked Movie DataBase ([1], many thanks to Oktie et al. for that project and their support). The so-called Linked Data Movie Quiz is available at [2], as part of a contest where webapps had to be developed in less than 10KB [3]. We hope that even if simple, and not accessing but a single repository, the application is able to showcase the power of using available Linked Data. (Indeed, a huge number of questions are automatically generated with very little code.) Doing well in the contest proved difficult, for there were very nice HTML5 demos in there, but I reckon it was good in terms of Linked Data evangelism. :-) (I tried to write during the contest, but somehow the email didn't make it through.) There are some more details about the LDMQ in [4]. We'll be glad if you try it out (advice: less bugs -but still some- if you don't use IE6 or 7) and send some feedback. Cheers, Guillermo (and Jorge) [1] http://www.linkedmdb.org/ [2] http://10k.aneventapart.com/Uploads/310/ [3] http://10k.aneventapart.com/Entry/310 [4] http://lamboratory.com/blog/2010/08/25/a-linked-data-movie-quiz-the-answers-are-out-there-and-so-are-the-questions/ -- Guillermo Álvaro Rey Researcher galv...@isoco.com #T +34 91 334 97 43 Edificio Testa - Avda. del Partenón 16-18, 1º, 7ª Campo de las Naciones 28042 MADRID iSOCO enabling the networked economy www.isoco.com P Please consider your environmental responsibility before printing this e-mail
Re: PUBLINK Linked Data Consultancy
On Thu, Oct 7, 2010 at 1:00 PM, Michael Schneider schn...@fzi.de wrote: Sören Auer wrote: PS: Please also keep in mind that PUBLINK is very limited (max. 3-5 data owning organizations) and ca. 10 man days of support for each. I think those numbers are the really important bits. I have seen EU projects where there were plans to perform really huge field studies. I would consider this a problem in this case (not only for existing startups, but also for the project consortium :)). But 3-5 organizations sounds fair to me and will probably not lead to much conflict with existing companies. Whether 10 man days will be sufficient is a different question... :) While I welcome more free assistance with linked data adoption, I think this would be most effective if it were targeted towards organisations that do not have existing funds to pay for training and consultancy. At Talis we have encountered several in that situation and while we help where we can we do have to earn an income. EU-funded help would be perfect for these organisations. Targeting organisations that would otherwise buy from a commercial company just undermines a nascent market. Ian
Re: WordNet RDF
On Wed, Sep 8, 2010 at 12:39 PM, Toby Inkster t...@g5n.co.uk wrote: Dear all, I've created a thin RDF wrapper around the WordNet 3.0 database (nouns only). For example: http://ontologi.es/WordNet/data/Fool Great work. There is a SPARQL'able version of WordNet 3.0 available via the Talis Platform: http://api.talis.com/stores/wordnet This is based on the RDF conversion at http://semanticweb.cs.vu.nl/lod/wn30/ How similar is your work to this version? Ian
Re: [ANN] Major update of Lexvo.org
Hi Bernard, On Wed, Jul 7, 2010 at 2:04 PM, Bernard Vatant bernard.vat...@mondeca.com wrote: Basically that's it. If this practice seems good from a social and technical viewpoint it could be a good idea to document it in a more formal way and put it somewhere on the wiki. A page was set up on the wiki a while ago about this issue; sorry, I can't find the page address now or who set it up, and I can't access http://community.linkeddata.org/MediaWiki/ right now for some reason. This looks like a really good and sensible social process and I'd support it being more public on the wiki. Also, if anyone has datasets that they feel they cannot maintain any more, please consider using the Talis Connected Commons Scheme. We will host any public domain linked data for free forever (or at least for as long as Talis exists, which is 40 years so far). This is really a no-hassle solution with no strings whatsoever, just the commitment that the data is in the public domain for everyone's reuse. We are also looking to automatically deposit such datasets with the Internet Archive in the future. Please have a look at http://www.talis.com/platform/cc/ for more information on this scheme. Ian Looking forward to the feedback Bernard 2010/7/5 Gerard de Melo gdem...@mpi-inf.mpg.de Hi everyone, We'd like to announce a major update of Lexvo.org [1], a site that brings information about languages, words, characters, and other human language-related entities to the LOD cloud. Lexvo.org adds a new perspective to the Web of Data by exposing how everything in our world is connected in terms of language, e.g. via words and names and their semantic relationships. Lexvo.org first went live in 2008 just in time for that year's ISWC. Recently, the site has undergone a major revamp, with plenty of help from Bernard Vatant, who has decided to redirect lingvoj.org's language URIs to the corresponding Lexvo.org ones.
At this point, the site is no longer considered to be in beta testing, and we invite you to take a closer look. On the front page, you'll find links to examples that will allow you to get a feel for the type of information being offered. We'd love to hear your comments. Best, Gerard [1] http://www.lexvo.org/ -- Gerard de Melo gdem...@mpi-inf.mpg.de Max Planck Institute for Informatics http://www.mpi-inf.mpg.de/~gdemelo/ -- Bernard Vatant Senior Consultant Vocabulary Data Engineering Tel: +33 (0) 971 488 459 Mail: bernard.vat...@mondeca.com Mondeca 3, cité Nollez 75018 Paris France Web: http://www.mondeca.com Blog: http://mondeca.wordpress.com
Re: RDF Extensibility
2010/7/6 Dan Brickley dan...@danbri.org: 2010/7/6 Jiří Procházka oji...@gmail.com: It would have a meaning. It would just be a false statement. The same as the following is a false statement: foaf:Person a rdf:Property . Why do you think so? I believe it is valid RDF and even valid under RDFS semantic extension. Maybe OWL says something about disjointness of RDF properties and classes. A URI can be many things. It just so happens, as a fact in the world, that the thing called foaf:Person isn't a property. It's a class. I think that is your view, and the view you have codified as the authoritative definition that I can look up at that URI, but there is nothing preventing me from making any assertion I like and working with that in my own environment. If it's useful to me to say foaf:Person a rdf:Property then I can just do that. However, I shouldn't expect that assertion to interoperate with other people's views of the world. Ian
Re: Show me the money - (was Subjects as Literals)
On Fri, Jul 2, 2010 at 4:44 AM, Pat Hayes pha...@ihmc.us wrote: Jeremy, your argument is perfectly sound from your company's POV, but not from a broader perspective. Of course, any change will incur costs by those who have based their assumptions upon no change happening. Your company took a risk, apparently. IMO it was a bad risk, as you could have implemented a better inference engine if you had allowed literal subjects internally in the first place, but whatever. But that is not an argument for there to be no further change for the rest of the world and for all future time. Who knows what financial opportunities might become possible when this change is made, opportunities which have not even been contemplated until now? I think Jeremy speaks for most vendors that have made an investment in the RDF stack. In my opinion the time for this kind of low level change was back in 2000/2001 not after ten years of investment and deployment. Right now the focus is rightly on adoption and fiddling with the fundamentals will scare off the early majority for another 5 years. You are right that we took a risk on a technology and made our investment accordingly, but it was a qualified risk because many of us also took membership of the W3C to have influence over the technology direction. I would prefer to see this kind of effort put into n3 as a general logic expression system and superset of RDF that perhaps we can move towards once we have achieved mainstream with the core data expression in RDF. I'd like to see 5 or 6 alternative and interoperable n3 implementations in use to iron out the problems, just like we have with RDF engines (I can name 10+ and know of no interop issues between them) Ian
Re: destabilizing core technologies: was Re: An RDF wishlist
Patrick, Without disputing your wider point that HTML hit the sweet spot of usability and utility I will dispute the following: HTML 3.2 did have: 1) *A need perceived by users as needing to be met* Did users really know they wanted to link documents together to form a world wide web? I spent much of the late nineties persuading companies and individuals of the merits of being part of this new web thing and then gritting my teeth when it came to actually showing them how to get a page online - it was a painful confusion of text editors (no, you can't use WordPerfect), fumbling in the dark (no WYSIWYG), dialup (you mean I have to pay?) and FTP! When MS FrontPage came along the users loved it because all that pain went away but they could not understand why so many people laughed at the results. I think we all have short memories. The advantage that HTML had was that people were able to use it before creating their own, i.e. they were already reading websites so could at some point say I want to make one of those. The problem RDF is gradually overcoming is this bootstrapping stage. It has a harder time because, to be frank, data is dull. But now people are seeing some of the data being made available in browseable form e.g. at data.gov.uk or dbpedia and saying, I want to make one of those. Ian
Re: Show me the money - (was Subjects as Literals)
On Fri, Jul 2, 2010 at 10:19 AM, Patrick Durusau patr...@durusau.net wrote: I make this point in another post this morning but is your argument that investment by vendors = I think I just answered it there, before reading this message. Let me know if not! Ian Ian
Re: Subjects as Literals, [was Re: The Ordered List Ontology]
Yves, On Fri, Jul 2, 2010 at 10:15 AM, Yves Raimond yves.raim...@gmail.com wrote: First: this is *not* a dirty hack. "Brickley" bif:contains "ckley" is a perfectly valid thing to say. You could, today, use data: URIs to represent literals with no change to any RDF system. Ian
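Ian's data: URI suggestion can be sketched: a literal can be smuggled into subject position by minting a data: URI (RFC 2397) for it, which any RDF system already accepts as a subject. A minimal Python sketch (the text/plain media type and round-trip helpers are illustrative assumptions, not a proposed standard):

```python
from urllib.parse import quote, unquote

def literal_to_data_uri(text):
    # Percent-encode the literal into an RFC 2397 data: URI.
    return "data:text/plain," + quote(text)

def data_uri_to_literal(uri):
    # Recover the literal from the data: URI.
    prefix = "data:text/plain,"
    assert uri.startswith(prefix)
    return unquote(uri[len(prefix):])

# The literal "Brickley" usable as a triple subject:
subject = literal_to_data_uri("Brickley")
triple = (subject, "bif:contains", literal_to_data_uri("ckley"))
print(subject)  # data:text/plain,Brickley
```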
Re: Subjects as Literals, [was Re: The Ordered List Ontology]
On Fri, Jul 2, 2010 at 8:34 PM, Jeremy Carroll jer...@topquadrant.com wrote: On 7/2/2010 12:00 PM, Dan Brickley wrote: Or maybe we should all just take a weekend break, mull things over for a couple of days, and start fresh on monday? That's my plan anyhow... Yeah, maybe some of us could meet up in some sunny place and sit in an office, maybe at Stanford - just like last weekend! I have to say that meeting was a lot more civilised than the current raging debate on these lists! Jeremy Ian
Re: Organization ontology
On Tue, Jun 1, 2010 at 8:50 AM, Dave Reynolds dave.e.reyno...@googlemail.com wrote: We would like to announce the availability of an ontology for description of organizational structures including government organizations. Congratulations on the publication of this ontology! I've added it to Schemapedia here: http://schemapedia.com/schemas/org I noticed a small semantic typo in the example at the end of section 3. skos:preferredLabel should be skos:prefLabel Ian
ANNOUNCE: lod-announce list
Hi all, Now we are getting a steady growth in the number of Linked Data sites, products and services I thought it was time to create a low-volume announce list for Linked Data related announcements so people can keep up to date without needing to wade through the LOD discussion. You can join the list at http://groups.google.com/group/lod-announce Here is its summary: A low-traffic, moderated list for announcements about Linked Open Data and only for announcements. On topic messages include announcements of new Linked Data sites, data dumps, services, vocabularies, books, talks, products, tools, events, jobs and conferences with a Linked Data programme. You don't need to join the list to post to it, but all posts are moderated to ensure they stay on-topic. Please feel free to forward this message to other lists that you think might be relevant. Cheers, Ian PS Let me know if you are interested in being a moderator too.
Re: Linking a vCard to its holder
Hi, you could use http://open.vocab.org/terms/businessCard There is also a proposal to add a similar property to FOAF at http://wiki.foaf-project.org/w/term_businessCard On Fri, May 14, 2010 at 1:32 PM, Felix Ostrowski felix.ostrow...@googlemail.com wrote: Hi, vCards in RDF seem to be a good way to describe people and organizations with regards to address information etc. Is there any convention to link a vCard to the person or organization it describes (i.e. its holder), e.g. http://example.org/me/vcard "vCardOf" http://example.org/me or http://example.org/me "hasvCard" http://example.org/me/vcard So far, I couldn't find an established predicate that does so... Cheers, Felix P.S. I find the examples at http://www.w3.org/Submission/vcard-rdf/#Ex to be rather misleading. <v:VCard rdf:about="http://example.com/me/corky"> ... </v:VCard> is not an assertion I'd make about myself. I am not a vCard.
Re: Cross site scripting: CORS and a Javascript library accessing Linked Data
Hi Nathan, On Mon, May 10, 2010 at 10:49 PM, Nathan nat...@webr3.org wrote: Could everybody publishing linked data please note that open data isn't currently retrievable via client side JS libraries due to same origin policies and the like. In order to make it open and accessible by UAs we need to add in CORS [1] headers. Just to be slightly pedantic, this is only a problem for applications running inside web browser sandbox contexts. Standalone apps, dedicated semweb browsers, iPhone apps, greasemonkey scripts etc. don't suffer this limitation. That said, we are looking at CORS for support by Talis, with the caveat that it has still not reached REC stage and we prefer to implement agreed standards rather than ones in progress unless we're very confident they won't change. Ian
Re: DBpedia hosting burden
On Wed, Apr 14, 2010 at 8:04 PM, Dan Brickley dan...@danbri.org wrote: Bills the major operative word in a world where the Bill Payer and Database Maintainer is a footnote (at best) re. perception of what constitutes the DBpedia Project. If dbpedia.org linked to the sparql endpoints of mirrors then that would be a way of sharing the burden. Ian
Re: UK Govt RDF Data Sets
Kingsley, You should address your question directly to the project organisers; we're a technology provider and host some of the data but it is not up to us when or where the dumps get shared. My understanding is that because this is officially sanctioned data they want to ensure that the provenance is built into the datasets properly. My hope and wish is that the commitment to making dumps available will be built into the guidelines the UK Government are working on. But those won't be issued during this month because of the election. Ian On Thu, Apr 15, 2010 at 11:19 PM, Kingsley Idehen kide...@openlinksw.com wrote: Ian, While on the subject of mirrors and Linked Open Data in general. Do you have any idea as to the whereabouts of RDF data sets for the SPARQL endpoints associated with data.gov.uk? As you can imagine, I haven't opted to crawl your endpoints for the data, bearing in mind the LOD community ethos, i.e. publish dataset dump locations for SPARQL endpoints that host Linked Open Data. This best practice was devised with SPARQL endpoint crawling in mind. Example: http://data.gov.uk/sparql Where would I get the actual RDF datasets loaded into the endpoint above? Here is the RPI example re. data.gov: http://data-gov.tw.rpi.edu/wiki/Data.gov_Catalog_-_Complete . -- Regards, Kingsley Idehen President & CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
Re: UK Govt RDF Data Sets
On Fri, Apr 16, 2010 at 12:09 AM, Kingsley Idehen kide...@openlinksw.com wrote: Ian Davis wrote: Kingsley, You should address your question directly to the project organisers, we're a technology provider and host some of the data but it is not up to us when or where the dumps get shared. My understanding is that because this is officially sanctioned data they want to ensure that the provenance is built into the datasets properly. My hope and wish is that the commitment to making dumps available will be built into the guidelines the UK Government are working on. But those won't be issued during this month because of the election. Okay, but the need for dumps is working its way into the fundamental guidelines for Linked Open Data. As you can imagine (and I have raised these concerns on the UK Govt mailing list a few times), this project is high profile and closely associated with Linked Open Data; thus, unclarity about these RDF dumps is confusing to say the very least. Anyway, I am set for now, will wait and see re. what happens post election etc.. I should also add that some datasets do not have dumps, e.g. the reference time and dates http://reference.data.gov.uk/doc/hour/2010-03-23T21 Ian
Re: Announce: Linked Data Patterns book
On Wed, Apr 7, 2010 at 12:14 AM, Peter Ansell ansell.pe...@gmail.com wrote: In the Annotation publishing pattern section there is the following statement: It is entirely consistent with the Linked Data principles to make statements about third-party resources. I don't believe that to be true, simply because, unless users are always using a quad model (RDF+NamedGraphs), they have no way of retrieving that information just by resolving the foreign identifier which is the subject of the RDF triple. They would have to stumble on the information by knowing to retrieve the object URI, which isn't clear from the pattern description so far. In a triples model it is harmful to have this pattern as Linked Data, as the statements are not discoverable just knowing the URI. Can you elaborate more on the harm you suggest here? I don't think we need to limit the data published about a subject to that subset retrievable at its URI. (I wrote a little about this last year at http://blog.iandavis.com/2009/10/more-than-the-minimum ) I also don't believe this requires the use of quads. I think it can be interlinked using rdfs:seeAlso. Ian
Re: Linking HTML pages and data
On Wed, Feb 17, 2010 at 2:01 AM, Kingsley Idehen kide...@openlinksw.com wrote: I really don't believe we achieve much via: <link rel="primarytopic" href="http://education.data.gov.uk/id/school/56" /> primarytopic isn't an IANA registered link type. Yes, I know. Nor is foaf:primarytopic :) I think there's a good chance of getting wide adoption for rel=primarytopic as a pattern / microformat / whatever. Having that very simple relation would be a massive boost for cross-linking the document web with the data web, important enough to warrant a special case IMHO. If you absolutely need to use foaf then it's better to qualify it: <link rel="foaf:primarytopic" href="http://education.data.gov.uk/id/school/56" /> Yes, it's a PITA for the average HTML user/developer, but being superficially simpler doesn't make it a valid long term solution. There is a standard in place for custom typed links re. <link/>. The two are not exclusive. In an RDFa environment, I would suggest using foaf:primaryTopic (note case too - too easy for developers to mis-type) Ian
Re: Linking HTML pages and data
On Tue, Feb 16, 2010 at 7:42 PM, Ed Summers e...@pobox.com wrote: I also agree w/ Kingsley that it would be neat to also have a link pattern that non-RDFa folks could use: <link rel="http://xmlns.com/foaf/0.1/primaryTopic" href="http://dbpedia.org/resource/Mogwai_(band)" title="Mogwai" /> I have been promoting the use of the simpler primarytopic rel value as a pattern for linking HTML pages to the things they are about. I don't think we need to complicate things with pseudo namespaces etc. for HTML, just focus on something simple people can copy. You can see it in use on data.gov.uk: http://education.data.gov.uk/doc/school/56 contains: <link rel="primarytopic" href="http://education.data.gov.uk/id/school/56" /> Ian
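The consuming side of this pattern is equally simple: pull the rel=primarytopic target out of a page so a client can jump from the document to the thing it is about. A minimal Python sketch using the stdlib HTML parser (the class name is my own; the markup is the data.gov.uk example above):

```python
from html.parser import HTMLParser

class PrimaryTopicFinder(HTMLParser):
    """Collect the href of the first <link rel="primarytopic"> element."""
    def __init__(self):
        super().__init__()
        self.topic = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "primarytopic" and self.topic is None:
            self.topic = a.get("href")

html = '<link rel="primarytopic" href="http://education.data.gov.uk/id/school/56" />'
finder = PrimaryTopicFinder()
finder.feed(html)
print(finder.topic)  # http://education.data.gov.uk/id/school/56
```

(HTMLParser routes self-closing tags through handle_starttag as well, so the XHTML-style `/>` form above is handled.)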
Re: Fresnel: State of the Art?
The Fresnel Path Language was submitted as a note to the W3C a while back: http://www.w3.org/2005/04/fresnel-info/fsl/ I implemented that in PHP as part of the moriarty library: http://code.google.com/p/moriarty/source/browse/trunk/graphpath.class.php I think FSL is very interesting (having looked at many path languages for RDF over the past 5 or 6 years) and I'd like to see more implementations. Ian On Mon, Feb 1, 2010 at 1:44 PM, Aldo Bucchi aldo.buc...@gmail.com wrote: Hi, I was looking at the current JFresnel codebase and the project seems to have little movement. I was wondering if this is the state of the art regarding Declarative Presentation Knowledge for RDF or have efforts moved elsewhere and I have missed it? Thanks! A -- Aldo Bucchi skype:aldo.bucchi http://www.univrz.com/ http://aldobucchi.com/ PRIVILEGED AND CONFIDENTIAL INFORMATION This message is only for the use of the individual or entity to which it is addressed and may contain information that is privileged and confidential. If you are not the intended recipient, please do not distribute or copy this communication, by e-mail or otherwise. Instead, please notify us immediately by return e-mail.
Re: PHP RDF fetching code
You may find something useful in my Moriarty project: http://code.google.com/p/moriarty/ It's geared towards the Talis Platform but there is a lot of code in there that has no dependencies on the platform, e.g.: http://code.google.com/p/moriarty/source/browse/trunk/httprequest.class.php some documentation for that class here: http://code.google.com/p/moriarty/wiki/HttpRequest Ian
Re: Contd: [pedantic-web] question about sioc / foaf usage
I assume you've noticed the dearth of RDF examples that include descriptions of RDF files that are distinct, but connected, to the file contents. People have been doing that for years using foaf:primaryTopic. See example at http://xmlns.com/foaf/spec/#term_PersonalProfileDocument and substitute URIs for the nodeIDs Ian
Re: Contd: [pedantic-web] question about sioc / foaf usage
On Mon, Nov 30, 2009 at 10:37 PM, Kingsley Idehen kide...@openlinksw.com wrote: If you look up Linked Data from spaces associated with myself or OpenLink you will see use of the aforementioned property re. the missing relation. Also, you may find that a few people added the missing triple to their RDF files after nudges from me. I hope I've made things clearer? I've read this thread and I don't understand the fuss. Some people aren't linking the document to the data it contains so we should encourage them to. Don't know why that is characterised as a debacle. Ian
Re: Contd: [pedantic-web] question about sioc / foaf usage
On Tue, Dec 1, 2009 at 12:02 AM, Peter Ansell ansell.pe...@gmail.com wrote: The necessary declaration of the document as distinct from, and yet necessary for, the definition of data, and the necessity of different URIs for these two concepts, are fundamental sticking points for many people. Who is getting stuck on this point? Documents have URIs, as do the things documents might contain data about. If the HTTP web no longer existed (or the internet connection was temporarily down), the discussion about document versus data would be moot. Simple RDF triple database queries, which do not rely on HTTP communication, have no need to refer to the Document/Artifact. Only data would exist in the RDF triples (unless you deliberately blur the division using the notion of foaf:Document via foaf:primaryTopic for instance). Hence the debacle with saying that Document is a necessary element to understand and use RDF data linked together using resolvable HTTP URIs when to many it is just an artifact that doesn't influence, and shouldn't need to semantically interfere with, the data/information content that is actually being referenced. I disagree. Documents aren't HTTP artefacts: they exist happily on disks, printouts and in books. You can identify the medium (the data container in Kingsley's words) separately from the things it is describing (the data items). In fact it is usually necessary to do so, and intuitive for most people, who can distinguish the publisher of a book from the protagonist it describes. In the long term, I see it as introducing a permanent link from a semantic RDF (or other similar format) universe to the current document segregated web that wouldn't be there if everyone shared their RDF information through some other system, and for example only used the URI verbatim to do queries on some global hashtable/index somewhere where there was no concept of document at the native RDF level.
The definition of Linked Data doesn't specifically say that HTTP URIs have to be resolved using HTTP GET requests over TCP port 80 using DNS for an intermediate host name lookup as necessary, so why should it require the notion of documents to be necessary containers for data, pretty much just because that is how HTTP GET semantics work? I characterise it as a debacle because it has been a recurring discussion for many years and shows that the semantic community hasn't quite cleaned up its architecture/philosophy enough for it to be clear to people who are trying to understand it and utilise it without delving into philosophical debates. It seems pretty clear to me and many others in my experience, certainly not a debacle. Cheers, Peter Ian
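The distinction Ian defends is straightforward to write down. A minimal Turtle sketch (all example.org URIs are hypothetical) giving the document and the thing it describes separate URIs, linked via foaf:primaryTopic:

```turtle
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# The document: an artifact with its own metadata
<http://example.org/doc/alice> a foaf:Document ;
    dcterms:creator   <http://example.org/id/bob> ;
    foaf:primaryTopic <http://example.org/id/alice> .

# The thing the document describes: a person, not a web page
<http://example.org/id/alice> a foaf:Person ;
    foaf:name "Alice" .
```

With two URIs, statements about who created the document and statements about Alice herself never collide.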
Showcase of Linked Data at Online Information
Hi all, I am delivering a Semantic Web track keynote at Online Information this year ( http://www.online-information.co.uk/ ) on the subject of The Reality of Linked Data. I want to showcase the work of the community in putting Linked Data to work and bringing real benefit to end users. I am looking for examples of live applications of the technology that go beyond toy or theoretical examples. I'm not really looking for examples of datasets unless they are demonstrating significant adoption outside of the core community (i.e. not dbpedia but bestbuy and the bbc are significant). If you are applying Linked Data please send me an outline of what you are doing, links to where I can see it and a screenshot you would like me to use (and are happy for me to put in the public domain as part of my slide deck). I will endeavour to include as many as I can in my talk. Best regards, Ian
Re: Top three levels of Dewey Decimal Classification published as linked data
On Mon, Aug 24, 2009 at 2:56 PM, Ross Singer rossfsin...@gmail.com wrote: Anyway, yes, I think some more thought needs to go into Dewey and LCSH's relationship to the real world. I think http://www.flickr.com/photos/danbri/3282565132 might be relevant here. The classification that danbri uses in that diagram is quite interesting. I paraphrase the categories as: things, types, web documents (or information resources) and conceptualizations. I'm not attempting to define them at the moment. I tried to enumerate how these four categories interrelate: things - things via general rdf properties; things - types via rdf:type; things - web documents via foaf:topic/foaf:isTopicOf/rdfs:seeAlso; web documents - types via rdf:type, maybe via foaf:topic if the document is describing the type; web documents - conceptualizations via dc:subject; web documents - web documents via rdfs:seeAlso etc; types - types via rdfs:subClassOf; conceptualizations - conceptualizations via skos:broader/skos:narrower/etc. A couple were missing. For things - conceptualizations I recently created ov:category [1] and ov:isCategoryOf [2], which I used in productdb.org to link things with their categories (e.g. http://productdb.org/2006-honda-element). Using dc:subject didn't seem right - does a model of car have a subject? This is what I would suggest you use to relate an author to a category about them. The other one that is missing is types - conceptualizations. SKOS says there is no defined relationship [3]. Interestingly, the RDF Semantics has this to say [4]: RDFS classes can be considered to be rather more than simple sets; they can be thought of as 'classifications' or 'concepts' which have a robust notion of identity which goes beyond a simple extensional correspondence. -Ross. Ian [1] http://open.vocab.org/terms/category [2] http://open.vocab.org/terms/isCategoryOf [3] http://www.w3.org/TR/skos-reference/#L896 [4] http://www.w3.org/TR/rdf-mt/#technote
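Ian's four-way enumeration can be sketched in Turtle. The example.org URIs below are hypothetical; only properties named in the post are used:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ov:   <http://open.vocab.org/terms/> .

# thing - type
<http://example.org/id/car1> rdf:type <http://example.org/type/Car> .
# web document - thing
<http://example.org/doc/car1> foaf:topic <http://example.org/id/car1> .
# thing - conceptualization (Ian's ov:category)
<http://example.org/id/car1> ov:category <http://example.org/concept/SUVs> .
# web document - conceptualization
<http://example.org/doc/car1> dc:subject <http://example.org/concept/SUVs> .
# type - type
<http://example.org/type/Car> rdfs:subClassOf <http://example.org/type/Vehicle> .
# conceptualization - conceptualization
<http://example.org/concept/SUVs> skos:broader <http://example.org/concept/Cars> .
```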
Re: Top three levels of Dewey Decimal Classification published as linked data
On Wed, Aug 19, 2009 at 7:27 PM, Panzer, Michael panz...@oclc.org wrote: Hi all, I would like to announce the availability of the DDC Summaries as a linked data service that uses SKOS and other vocabularies for representation [1]. Please take a look if you like. Comments, suggestions, and advice are really appreciated! Very pleased to see this happen at OCLC and I hope there's more to come! Ian
Re: ProductDB
On Fri, Aug 14, 2009 at 10:47 AM, Toby Inkster t...@g5n.co.uk wrote: Does anyone know of any vocabs that provide terms like these? If not, shall I add to the VoCampBristol2009 todo list? Maybe my first ever RDF schema would be useful: http://vocab.org/barter/0.1/ Ian
Re: Distributed versioning for RDF?
Have you looked into changesets, which are used by the Talis Platform? See http://n2.talis.com/wiki/Changesets and http://vocab.org/changeset/schema.html Ian On Wed, Jul 29, 2009 at 2:37 PM, Axel Rauschmayer a...@rauschma.de wrote: Offhand, I see the following requirements for many (mostly social) RDF applications: - text indexing - text diff for versioning - distributed versioning and synchronization. http://en.wikipedia.org/wiki/Distributed_version_control - provenance: author, data source (which might have named graphs) Open Anzo [1] and OpenLink Data Spaces [2] come pretty close, but, as far as I can tell, don't offer distributed versioning. Is there anything else out there that I might have missed? Thanks! Axel [1] http://www.openanzo.org/ [2] http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/Ods -- Axel Rauschmayer a...@rauschma.de http://www.pst.ifi.lmu.de/people/staff/rauschmayer/axel-rauschmayer/ http://2ality.blogspot.com/ http://hypergraphs.de/
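For readers unfamiliar with the vocabulary: a changeset describes an update to a resource as a set of added and removed statements, expressed with RDF reification. A sketch from memory of the schema at vocab.org (the example.org URIs and literal values are hypothetical; check the schema for the exact property names):

```turtle
@prefix cs:   <http://purl.org/vocab/changeset/schema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.org/changes/1> a cs:ChangeSet ;
    cs:subjectOfChange <http://example.org/id/alice> ;
    cs:createdDate  "2009-07-29T12:00:00Z" ;
    cs:creatorName  "Axel" ;
    cs:changeReason "Corrected spelling of name" ;
    # the statement being retracted, reified
    cs:removal [ a rdf:Statement ;
                 rdf:subject   <http://example.org/id/alice> ;
                 rdf:predicate foaf:name ;
                 rdf:object    "Alcie" ] ;
    # the statement replacing it
    cs:addition [ a rdf:Statement ;
                  rdf:subject   <http://example.org/id/alice> ;
                  rdf:predicate foaf:name ;
                  rdf:object    "Alice" ] .
```

A sequence of such changesets ordered by date gives a version history, though not by itself the distributed synchronization Axel asked about.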
Linked Data and the Public Domain
I wrote up some background to licensing, waivers, the public domain and how it applies to linked data. I also included examples of how you can declare a public domain waiver for a linked dataset. http://blogs.talis.com/nodalities/2009/07/linked-data-public-domain.php Hope this helps people with these complex issues. In other news, our tutorial Legal and Social Frameworks for Sharing Data on the Web which covers these kinds of issues was accepted for ISWC2009. I am not a lawyer, but we will have an expert lawyer taking part in that tutorial (see http://www.jordanhatcher.com/ ) Ian
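As a concrete illustration of the kind of declaration the post describes, a dataset description can point at a public domain waiver directly. A hedged sketch (the dataset URI is hypothetical, and dcterms:license is one common choice of property, as used with VoID dataset descriptions):

```turtle
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/dataset> a void:Dataset ;
    dcterms:title "Example dataset" ;
    # waive all rights with the Open Data Commons PDDL
    dcterms:license <http://www.opendatacommons.org/licenses/pddl/> .
```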
Re: [Ann] LinkedGeoData.org
On Wednesday, July 8, 2009, Sören Auer a...@informatik.uni-leipzig.de wrote: Dear Colleagues, On behalf of the AKSW research group [1] I'm pleased to announce the first public version of the LinkedGeoData.org datasets and services. LinkedGeoData is a comprehensive dataset derived from the OpenStreetMap database covering RDF descriptions of more than 350 million spatial features (i.e. nodes, ways, relations). LinkedGeoData currently comprises RDF dumps, Linked Data and REST interfaces, links to DBpedia as well as a prototypical user interface for linked-geo-data browsing and authoring. Very nice. How long do you think it will take for the entire dataset to be available? OpenStreetMap are voting soon on whether to adopt the Open Data Commons sharealike database license. If they adopt it, will you also adopt it for this data? Sören Auer Ian [1] http://aksw.org -- Sören Auer, AKSW/Computer Science Dept., University of Leipzig http://www.informatik.uni-leipzig.de/~auer, Skype: soerenauer
Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
On Wednesday, July 8, 2009, Toby Inkster t...@g5n.co.uk wrote: On Wed, 2009-07-08 at 15:13 +0100, Mark Birbeck wrote: The original point of this thread seemed to me to be saying that if .htaccess is the key to the semantic web, then it's never going to happen. It simply isn't the key to the semantic web though. .htaccess is a simple way to configure Apache to do interesting things. It happens to give you a lot of power in deciding how requests for URLs should be translated into responses of data. If you have hosting which allows you such advanced control over your settings, and you can create nicer URLs, then by all means do so - and not just for RDF, but for all your URLs. It's a Good Thing to do, and in my opinion, worth switching hosts to achieve. But all that isn't necessary to publish linked data. If you own example.com, you can upload foaf.rdf and give yourself a URI like: http://example.com/foaf.rdf#alice (Or foaf.ttl, foaf.xhtml, whatever.) This just works and is how the HTML web grew. Write a document and save it into a public space. Fancy stuff like pretty URIs needs more work but is not at all necessary for linked data or the semantic web. Let's not blow this all out of proportion. Hear hear! -- Toby A Inkster mailto:m...@tobyinkster.co.uk http://tobyinkster.co.uk
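Toby's no-configuration approach needs nothing more than a single uploaded file. A minimal foaf.ttl sketch (example.com and the names are hypothetical); because the URI uses a hash fragment, fetching it simply returns the whole document, with no 303 redirect or .htaccess rule involved:

```turtle
# Saved as http://example.com/foaf.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.com/foaf.ttl#alice> a foaf:Person ;
    foaf:name     "Alice" ;
    foaf:homepage <http://example.com/> .
```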
Re: LOD Data Sets, Licensing, and AWS
On Wed, Jun 24, 2009 at 4:05 PM, Kingsley Idehen kide...@openlinksw.com wrote: My comments are still fundamentally about my preference for CC-BY-SA. Hence the transcopyright reference :-) I want Linked Data to have its GPL equivalent; a license scheme that: Have you read the licenses at http://opendatacommons.org/ ? Ian
The Public Domain (was Re: LOD Data Sets, Licensing, and AWS)
On Wed, Jun 24, 2009 at 9:56 PM, Kingsley Idehen kide...@openlinksw.com wrote: The NYT, London Times, and others of this ilk, are more likely to contribute their quality data to the LOD cloud if they know there is a vehicle (e.g., a license scheme) that ensures their HTTP URIs are protected i.e., always accessible to user agents at the data representation (HTML, XML, N3, RDF/XML, Turtle etc..) level; thereby ensuring citation and attribution requirements are honored. I agree with that, but it only covers a small portion of what is needed. You fail to consider the situations where people publish data about other people's URIs, such as reviews or annotations. The foaf:primaryTopic mechanism isn't strong enough if the publisher requires full attribution for use of their data. If I use SPARQL to extract a subset of reviews to display on my site then in all likelihood I have lost that linkage with the publishing document. Attribution is the kind of thing one gives as the result of a license requirement in exchange for permission to copy. In the academic world for journal articles this doesn't come into play at all, since there is no copying (in the usual case). Instead people cite articles because the norms of their community demand it. Yes, and the HTTP URI ultimately delivers the kind of mechanism I believe most traditional media companies seek (as stated above). They ultimately want people to use their data with low cost citation and attribution intrinsic to the medium of value exchange. The BBC is a traditional media company. Its data is licensed only for personal, non-commercial use: http://www.bbc.co.uk/terms/#3 btw - how are you dealing with this matter re. the neurocommons.org linked data space? How do you ensure your valuable work is fully credited as it bubbles up the value chain? 
I found this linked from the RDF Distribution page on neurocommons.org : http://svn.neurocommons.org/svn/trunk/product/bundles/frontend/nsparql/NOTICES.txt Everyone should read it right now to appreciate the complexity of aggregating data from many sources when they all have idiosyncratic requirements of attribution. Then read http://sciencecommons.org/projects/publishing/open-access-data-protocol/ to see how we should be approaching the licensing of data. It explains in detail the motivations for things like CC-0 and PDDL which seek to promote open access for all by removing restrictions: Thus, to facilitate data integration and open access data sharing, any implementation of this protocol MUST waive all rights necessary for data extraction and re-use (including copyright, sui generis database rights, claims of unfair competition, implied contracts, and other legal rights), and MUST NOT apply any obligations on the user of the data or database such as “copyleft” or “share alike”, or even the legal requirement to provide attribution. Any implementation SHOULD define a non-legally binding set of citation norms in clear, lay-readable language. Science Commons have spent a lot of time and resources to come to this conclusion, and they tried all kinds of alternatives such as attribution and share alike licences (as did Talis). The final consensus was that the public domain was the only mechanism that could scale for the future. Without this kind of approach, aggregating, querying and reusing the web of data will become impossibly complex. This is a key motivation for Talis starting the Connected Commons programme ( http://www.talis.com/platform/cc/ ). We want to see more data that is unambiguously reusable because it has been placed in the public domain using CC-0 or the Open Data Commons PDDL. So, I urge everyone publishing data onto the linked data web to consider waiving all rights over it using one of the licenses above. 
As Kingsley points out, you will always be attributed via the URIs you mint. Ian PS. This was the subject of my keynote at code4lib 2009 If you love something, set it free, which you can view here http://www.slideshare.net/iandavis/code4lib2009-keynote-1073812
Re: LOD Data Sets, Licensing, and AWS
Hi all, On Tue, Jun 23, 2009 at 9:36 PM, Kingsley Idehen kide...@openlinksw.com wrote: All, As you may have noticed, AWS still haven't made the LOD cloud data sets -- that I submitted eons ago -- public. Basically, the hold-up comes down to discomfort with the lack of license clarity re. some of the data sets. Action items for all data set publishers: 1. Integrate your data set licensing into your data set (for LOD I would expect CC-BY-SA to be the norm) Please do not use CC-BY-SA for LOD - it is not an appropriate licence and it is making the problem worse. That licence uses copyright, which does not hold for factual information. Please use an Open Data Commons license or CC-0 http://www.opendatacommons.org/licenses/ http://wiki.creativecommons.org/CC0 If your dataset contains copyrighted material too (e.g. reviews) and you hold the rights over that content then you should also apply a standard copyright licence. So for completeness you need a licence for your data and one for your content. If you use CC-0 you can apply it to both at the same time. Obviously if you aren't the rightsholder (e.g. it is scraped data/content from someone else) then you can't just slap any licence you like on it - you have to abide by the original rightsholder's wishes. Personally I would try and select a public domain waiver or dedication, not one that requires attribution. The reason can be seen at http://en.wikipedia.org/wiki/BSD_license#UC_Berkeley_advertising_clause where stacking of attributions becomes a huge burden. Having datasets require attribution will negate one of the linked data web's greatest strengths: the simplicity of remixing and reusing data. A group of us have submitted a tutorial on these issues for ISWC 2009, hopefully it will get accepted because this is a really important area of Linked Data that is poorly understood. 2. 
Indicate license terms in the appropriate column at: http://esw.w3.org/topic/DataSetRDFDumps If licenses aren't clear I will have to exclude offending data sets from the AWS publication effort. I completely support declaring what rights are asserted or waived for a dataset, so please everyone help this effort. Ian
Re: http://ld2sd.deri.org/lod-ng-tutorial/
On Tue, Jun 23, 2009 at 8:01 AM, Giovanni Tummarello g.tummare...@gmail.com wrote: Just a remark about what we're doing in Sindice, for all who want to be indexed properly by us. We recursively dereference the properties that are used, thus trying to obtain a closure over the description of the properties that are used. We also consider OWL imports. When the recursive fetching is complete, we apply RDFS + some OWL reasoning (OWLIM being the final reasoner at the moment) and index it. Just out of interest, if you detect an inconsistency do you still index it? Ian
Re: http://ld2sd.deri.org/lod-ng-tutorial/
On Tue, Jun 23, 2009 at 10:12 AM, Dan Brickley dan...@danbri.org wrote: On 23/6/09 11:01, Martin Hepp (UniBW) wrote: And Michael, please be frank - there is a tendency in the LOD community which goes along the lines of OWL and DL-minded SW research has proven obsolete anyway, so we LOD guys and girls just pick and use the bits and pieces we like and don't care about the rest. What made the Web so powerful is that its Architecture is extremely well-thought underneath the first cover of simplicity. One of those principles is partial understanding - the ability to do something useful without understanding everything... Absolutely. We should also remember that multiple ontologies may exist that cover a given term. I think this is often forgotten. There is no requirement that the ontology statements retrieved by dereferencing the URI should be used - they are only provided as _an_ additional source of information. There may be many other ways to discover relevant ontologies and a large class of those will be for private use. If I choose to assert that dc:date and rev:createdOn are equivalent via owl:equivalentProperty then that is my prerogative. The beauty of the semweb is that I can publish my assertions and potentially other people could choose to adopt them. Ian
Re: LOD Data Sets, Licensing, and AWS
On Tue, Jun 23, 2009 at 11:11 PM, Kingsley Idehen kide...@openlinksw.com wrote: Using licensing to ensure the data providers URIs are always preserved delivers low cost and implicit attribution. This is what I believe CC-BY-SA delivers. There is nothing wrong with granular attribution if compliance is low cost. Personally, I think we are on the verge of an Attribution Economy, and said economy will encourage contributions from a plethora of high quality data providers (esp. from the traditional media realm). Regardless of any attribution economy, CC-BY-SA is basically unenforceable for data so is not appropriate. You can't copyright the diameter of the moon. Ian
Re: LOD Data Sets, Licensing, and AWS
On Wednesday, June 24, 2009, Peter Ansell ansell.pe...@gmail.com wrote: 2009/6/24 Ian Davis li...@iandavis.com On Tue, Jun 23, 2009 at 11:11 PM, Kingsley Idehen kide...@openlinksw.com wrote: Using licensing to ensure the data providers URIs are always preserved delivers low cost and implicit attribution. This is what I believe CC-BY-SA delivers. There is nothing wrong with granular attribution if compliance is low cost. Personally, I think we are on the verge of an Attribution Economy, and said economy will encourage contributions from a plethora of high quality data providers (esp. from the traditional media realm). Regardless of any attribution economy, CC-BY-SA is basically unenforceable for data so is not appropriate. You can't copyright the diameter of the moon. Ian Interestingly, there is a large economy involved with patenting gene sequences. Aren't they facts also? Why is patenting different to copyright in this respect? I can't explain the technicalities (IANAL) but there are many different types of property rights that are granted by governments over information: copyright, database right, patent right, moral right etc. Each of those has separate legislation that varies by jurisdiction (WIPO is attempting to normalise some of them). It's complicated, which is why the efforts of Creative Commons, Science Commons and Open Data Commons are so valuable: they create simple ways for people to declare the conditions under which their data and content can be reused. Ian
Re: gimmee some data!
On Mon, Jun 15, 2009 at 10:14 AM, Toby Inkster t...@g5n.co.uk wrote: On Mon, 2009-06-15 at 01:03 +0100, Hugh Glaser wrote: On 15/06/2009 00:18, Toby A Inkster t...@g5n.co.uk wrote: I still need to add some 303 redirects in there. Better hurry up, people might find it... http://sameas.org/html?uri=http://ontologi.es/place/GB-WAR Cool! Thank you very much. I hope this is the start of a trend and that we all get birthday data! :) It was late, so I was cutting corners, but it's tidied up now, plus an index with voiD description has been added at http://ontologi.es/place/ Aside: http://unlocode.rkbexplorer.com/id/GBAFT should have label Alfriston (with an l) - perhaps an error in the source data? I noticed a slight encoding error on http://ontologi.es/place/GB-WAR - I think you have unescaped ampersands in there. Cheers, Ian
Re: gimmee some data!
Wow! Thank you, I'm really speechless. Best birthday present ever :) On Sunday, June 14, 2009, Hugh Glaser h...@ecs.soton.ac.uk wrote: What a fun idea I thought - more fool me? We should be able to do that pretty easily, shouldn't we? So I went and looked in the reference section of Project Gutenberg, and chose a book (A Short Biographical Dictionary of English Literature by John W. Cousin). However, several hours later, I am not pleased with the result, but really need to get back to my marking (yes, it was a bit of a displacement activity :-) ), and it may be enough for someone else to polish. Anyway, Happy Birthday! A brand new present: http://biolit.rkbexplorer.com/ Hugh On 14/06/2009 10:23, Danny Ayers danny.ay...@gmail.com wrote: It's Ian Davis' birthday tomorrow, and for it he wants some linked data. So what datasets does anyone know of that can be translated relatively quickly and easily, the stuff you are planning to do one day when you get time..? -- http://danny.ayers.name
Re: Common Tag - semantic tagging convention
Congratulations! This looks really good On Thursday, June 11, 2009, Andraz Tori and...@zemanta.com wrote: Hi guys, today a small consortium of web companies and one institute (AdaptiveBlue, DERI (NUI Galway), Faviki, Freebase, Yahoo!, Zemanta, and Zigtag) released a format specifying the expression of semantic tags so our tools will understand/publish them. http://commontag.org It is RDFa based and it does not mandate a specific vocabulary of meanings for tags. DBpedia and Freebase are currently used by Zigtag, Faviki and Zemanta. Soon the Glue browsing extension will support it too - so there's going to be an improved browsing experience as a bonus for semantically tagging the content. We tried to build on the previous experience of MOAT and others and simplify even further, so the barrier to entry will be as low as possible (but still RDFa is not grokked by a lot of publishing platforms). I am interested in your thoughts on this! And if anyone wants to use this somewhere please report it, so we'll put it under the Applications page at http://commontag.org [I'll be traveling in the next few days and won't be able to answer emails promptly, but there are guys from other organizations on this list that will] -- Andraz Tori, CTO Zemanta Ltd, New York, London, Ljubljana www.zemanta.com mail: and...@zemanta.com tel: +386 41 515 767 twitter: andraz, skype: minmax_test
Re: Linked Data upcoming Semantic Technologies 2009 Conference.
I'd like to attend but that wiki page appears to be locked for edits, so I can't add myself. Can you add me? On Tue, Jun 9, 2009 at 6:21 PM, Kingsley Idehen kide...@openlinksw.com wrote: All, If you're attending the Semantic Web Technologies 2009 Conference in San Jose, note that there will be a Linked Data meetup [1]. I also have a 30 minute slot covering the use of Linked Data to solve real problems [2], I hope to use Linked Data to expand the 30 minute window :-) I am also part of a Linked Data discussion panel [3] (moderated by: Paul Miller) that includes Leigh Dodds, Jamie Taylor, and others. Links: 1. http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/SanJoseGathering 2. http://semtech2009.com/session/2012/ 3. http://semtech2009.com/session/1988/ -- Regards, Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen President & CEO OpenLink Software Web: http://www.openlinksw.com
Re: New github project for RDFizer scripts
On Thu, May 21, 2009 at 7:53 PM, Kingsley Idehen kide...@openlinksw.com wrote: All, The 30+ XSLT stylesheets [1] used by our collection of Sponger Cartridges are now available for community development and enhancement via github [2]. Links: 1. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/ClickableVirtSpongerCloud 2. http://tr.im/m0PT Very nice :) I hope people start to feed back and make them even more useful Ian
Re: [ANN] Linking Open Data Triplification Challenge 2009
On Sat, May 16, 2009 at 10:03 AM, Michael Hausenblas michael.hausenb...@deri.org wrote: With the recent uptake of structured data/RDF by major players such as Google the motivation for exposing relational data and other structured data sources on the Web entered a new stage. We encourage participants to publish existing structured (relational) representations, which are already backing most of the existing Web sites and demonstrate useful and usable applications on top of it. Entrants for the competition might find the Talis Connected Commons scheme useful. It provides free hosting and services such as full text search, faceting and sparql for public domain datasets up to 50 million triples. See http://www.talis.com/platform/cc/ for details Ian
Commercial Product Announcements
Hi all, I'm not aware of any official policy on commercial posts to this list, but usual mailing list etiquette generally recommends that postings about commercial services and products should indicate their nature and give some indication of cost. I think this is even more important when the mailing list is focussed on a project with an open or free theme. Obviously, as I represent a vendor my preference is to allow commercial postings but I would also prefer that they are clearly labelled as such. Is this a policy that other members would like to see made explicit? Ian
Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase
On Tue, Nov 18, 2008 at 4:02 AM, Tim Berners-Lee [EMAIL PROTECTED] wrote: On 2008-11 -17, at 11:27, John Goodwin wrote: [...] I'd be tempted to generalise or just remove the domain/range restrictions. Any thoughts? There are lots of uses for range and domain. One is in the user interface -- if you for example link a person and a document, the system can prompt you for a relationship which will include is author of and made but won't include foaf:knows or is issue of. Similarly, when making a friend, one can use autocompletion on labels which the current session knows about and simplify it by for example removing all documents from a list of candidate foaf:knows friends. Both these use cases require some OWL to say that documents aren't people. I don't see these scenarios being feasible in the general case because you'd need a complete description of the world in OWL, i.e. you'd want to know about everything that can't possibly be a person. It is of course also important for checking hand-written files for validity. Again, isn't validity checking something that can only be done with OWL? RDFS only adds information. Tim BL Ian
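Both positions in this exchange show up in a small example (the ex: vocabulary is hypothetical): the declarations let a UI filter candidate properties, while under RDFS semantics the same declarations only ever add triples and never reject any:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/vocab/> .

ex:authorOf rdfs:domain ex:Person ;
            rdfs:range  ex:Document .

ex:alice a ex:Person .
ex:doc1  a ex:Document .

# A UI can now offer ex:authorOf as a candidate relation from
# ex:alice to ex:doc1, and hide it between two ex:Document nodes.
# The RDFS entailment runs the other way: asserting
#   ex:doc1 ex:authorOf ex:alice .
# makes a reasoner infer that ex:doc1 is an ex:Person -- it does
# not flag a violation, which is Ian's point about validation.
```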
Re: New LOD Cloud - Please send us links to missing data sources
I wonder if we could highlight those doing a great job in this space more, e.g. I believe Opera's foaf output is LOD On Fri, Sep 19, 2008 at 1:45 PM, Tom Heath [EMAIL PROTECTED] wrote: Sad but true. Things are improving in my experience but we still have some evangelism to do in this area. On 19/09/2008, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: but by that token you could probably wipe out most of foaf and doap space from the diagram Most of that data is not very linky and many primary resources being described don't have URIs On 9/19/08, Tom Heath [EMAIL PROTECTED] wrote: Hey Mischa, Good to hear you :) Just to add to what Peter said, last time I checked LiveJournal was not very Linked Data-friendly, which is a shame, naturally, as they were well ahead of the curve with the FOAF export. Cheers, Tom. 2008/9/19 Peter Ansell [EMAIL PROTECTED]: - [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] [EMAIL PROTECTED] To: public-lod@w3.org Sent: Friday, September 19, 2008 1:55:07 AM GMT +10:00 Brisbane Subject: Re: New LOD Cloud - Please send us links to missing data sources Hello, There doesn't seem to be any mention of LiveJournal or any of the LiveJournal-powered blogging sites, such as: vox, friendfeed, hi5 to name a few. I think they are implicitly in the FOAF cloud, for want of a better description of that node ;) Cheers, Peter Find out more about Talis at www.talis.com Shared InnovationTM Any views or personal opinions expressed within this email may not be those of Talis Information Ltd. The content of this email message and any files that may be attached are confidential, and for the usage of the intended recipient only. If you are not the intended recipient, then please return this message to the sender and delete it. Any use of this e-mail by an unauthorised recipient is prohibited. 
Talis Information Ltd is a member of the Talis Group of companies and is registered in England No 3638278 with its registered office at Knights Court, Solihull Parkway, Birmingham Business Park, B37 7YB. __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __
Job Advert: Linked Data Developer
== Linked Data Developer == Our platform development group is responsible for making sure that the Talis Platform is the premier environment for developing and delivering great Semantic Web applications. We need your help to convert and generate Linked Data sets for use by these and other applications. We're looking for people who: * use their code to communicate their ideas clearly * are experts in scripting and automating data conversions * have been involved in open source or community projects * can devise new strategies for processing data at scale * never forget about scalability, performance and security * prefer to develop test first * are proficient at modeling data in RDF * have an opinion on httpRange-14 * aren't afraid to ask questions * like to say let's try it and we can do that * understand how to balance perfection with reality * are as happy to lead as to follow * know when to reuse and when to start afresh * can tell us about something new they learned this year == How to apply == Take a look at the problems below and select two to answer. Please send us your C.V and an application letter telling us about yourself and including your answers to [EMAIL PROTECTED] Talis is based in Birmingham, UK but this role is amenable to remote working so we welcome non-UK applicants. 1. Describe how you would model the cast list of a movie as Linked Data. How would your approach cope if one cast member wanted higher billing on movie posters? Discuss the trade-offs involved in modeling lists and collections in RDF paying particular attention to how the data can be optimized for wide reuse. 2. Give an example of how you have automated long running processes in the past and some of the issues you encountered. What strategies would you adopt for coping with failures in the process and its environment? 3. Discuss your approach to producing Linked Data in domains that have no pre-existing consensus on the model or vocabulary. 
How would you ensure the data is immediately usable by as many interested parties as possible while retaining the ability to evolve the model? Additionally, how would you then build consensus in the community around the chosen model?