Re: Computer science publisher needs help with RDFa/HTTP technical issue [Re: How are RDFa clients expected to handle 301 Moved Permanently?]

2013-10-25 Thread Nathan Rixham
It's simpler than that; there are two quite simple issues.

1) They have said they changed from /Vol-1010/ to /Vol-1010 when
they have not - as the 301 Moved Permanently to /Vol-1010/
illustrates. If they had moved the URIs, the redirect would run the
other way around: /Vol-1010/ would 301 to /Vol-1010.

2) Differences between web-browser and RDFa base URI calculation, and
the ambiguity of not being specific, have compounded and confused the
issue further.

To address the situation they can just be specific: set the base of
the document to either http://ceur-ws.org/Vol-1010/ or
http://ceur-ws.org/Vol-1010. If they set it to the variant with the
trailing slash, they'll find both the HTML and the RDFa are correct;
if they set it to the variant without the trailing slash, they'll find
both the HTML and the RDFa have incorrect links.

Separately, it does raise the question of why uriburner and pyrdfa
both use the input URI http://ceur-ws.org/Vol-1010 as the base rather
than the one instructed by the HTTP 301 redirect, namely
http://ceur-ws.org/Vol-1010/ - perhaps this is an issue, or perhaps it
should be left as is to encourage the good practice of explicitly
saying what you mean.

Best, Nathan

On Fri, Oct 25, 2013 at 5:44 PM, Kingsley Idehen  wrote:
> On 10/25/13 12:03 PM, Christoph LANGE wrote:
>>
>> it seems the RDFa mailing list is not that active any more, as I haven't
>> got an answer for this question for two weeks.  As my question is also
>> related to LOD publishing, let me try to ask it here.  We, the
>> publishers of CEUR-WS.org, are facing a technical issue involving RDFa
>> and hash vs. slash URIs/URLs.
>
>
> What is your problem re. entity denotation?
>
> Simple rule of thumb:
>
> 1. Denote documents using URLs
> 2. Denote every other kind of entity using hash (as "#") based HTTP URIs.
>
> If "#" based HTTP URIs pose deployment problems, then you can consider using
> "/" based HTTP URIs, but you then have to attend to one of the following
> issues, which require tweaks to your data server:
>
> 1. Use the Path component (part) of your HTTP URIs to set up regular
> expression pattern-friendly markers that distinguish HTTP URIs that denote
> documents from those that denote every other type of entity -- basically,
> this is what you see re. "/page/" (for description documents) and
> "/resource/" (for every other entity type/kind) re., DBpedia
>
> 2. Use 303 to redirect entity URIs to the document URLs that denote their
> descriptors (description documents).
>
> If using 303 redirection presents deployment challenges, bearing in mind
> latest revisions to HTTP, you can use a 200 OK instead of a 303, but you
> MUST place the URL of the entity descriptor (description document) in the
> "Location: " header of your HTTP responses i.e., use HTTP response metadata
> to handle the ambiguity that "/" based HTTP URIs present.
>
> In my experience with RDFa, I've found it easiest to deploy using relative
> hash based HTTP URIs.
>
> Links:
>
> [1] http://bit.ly/15tk1Au -- hash based Linked Data URI illustrated
> [2] http://bit.ly/11xnQ36 -- hashless or slash based Linked Data URI
> illustrated
>
> --
>
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca handle: @kidehen
> Google+ Profile: https://plus.google.com/112399767740508618350/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>
>
>
>
>
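The 303 pattern Kingsley describes (option 2 above) can be sketched as a minimal HTTP handler; the /resource/ and /page/ path layout echoes the DBpedia-style convention he mentions, and all names and the Turtle payload here are illustrative, not anyone's actual deployment:

```python
from http.server import BaseHTTPRequestHandler

def route(path):
    """Map a request path to (status, headers) under the 303 pattern:
    '/resource/x' denotes an entity, '/page/x' the document describing it."""
    if path.startswith("/resource/"):
        name = path[len("/resource/"):]
        # Entity URI: 303-redirect the client to its descriptor document.
        return 303, {"Location": "/page/" + name}
    if path.startswith("/page/"):
        # Document URL: serve the description itself.
        return 200, {"Content-Type": "text/turtle"}
    return 404, {}

class SlashURIHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = route(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if status == 200:
            self.wfile.write(b"<#it> a <http://xmlns.com/foaf/0.1/Person> .\n")
```

Keeping the routing decision in a plain function makes the entity/document distinction easy to test independently of any running server.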



Re: What Does Point Number 3 of TimBL's Linked Data Mean?

2013-06-22 Thread Nathan Rixham
What it means now, or at any point in time, must be inclusive of new
in-development or in-use things, otherwise it will never mean anything
else later down the line.

If you want it to mean a very specific set of things at any one time, then
take "Linked Data" down the standardization path and give it fixed versions
which are RECs.

I don't see anybody saying "don't use RDF" or "RDF is a bad idea for Linked
Data, use Y instead". I just see some people implying that RDF precludes
all other things, and other people asking why it should preclude everything
else.

ps: I'd be very wary of saying that any web tech is ".. *the* universal
..": many of them are uniform, none are truly universal, even within their
specific domains, and any such claims will always be disagreed with by
somebody, as they are always untrue and alternatives always exist.

Best,

Nathan


On Sat, Jun 22, 2013 at 4:42 AM, David Booth  wrote:

> On 06/21/2013 07:03 PM, Nathan Rixham wrote:
>
>> Linked Data is a moving target, it's not Linked Data 1.0, 1.1, 2.0 etc,
>> it's a set of technologies which make it easy to have machine readable
>> data that is interlinked on the web.
>>
>
> Okay, but that is kind of like saying that the Web is a moving target
> because of the many technologies that come and go.  While that may be true,
> those technologies are not key to what makes the Web the Web. Certain key
> technologies are foundational and change little if at all: namely the use
> of URIs as the standard universal identification scheme.
>
> Similarly, although query languages like SPARQL and formats like Turtle,
> RDF/XML and JSON-LD may come and go, those technologies are *not* what
> makes the Semantic Web the Semantic Web.  RDF is *key* to making Semantic
> Web data easily machine interpretable and combinable, because it is *the*
> universal data model on which the Semantic Web is based.  It could evolve
> slowly, just as URIs are slowly evolving to permit IRIs, and it could
> eventually be supplanted by a new standard universal data model.  But for
> now and the foreseeable future it is the standard universal data model for
> the Semantic Web.
>
>
>
>> If Linked Data is built on HTTP currently, then the media types used
>> have to be registered, which limits the set, but this set of supported
>> mediatypes can and will change over time, as will the protocols used, as
>> will the ontologies and the data, and so forth.
>>
>> You can't lock it in stone, or preclude innovation and new
>> specifications, common sense and basic web architecture entail using
>> URIs/IRIs, common protocols (HTTP), registered media types, and so
>> forth, but if a large ecosystem of data in a new media type is
>> developed or an older one bootstrapped and commonly supported, it's
>> going to be Linked Data.
>>
>> Interoperability, modularity, and tolerance - they're all critical, and
>> none of them entail forever using only RDF and SPARQL.
>>
>
> Forever is a long time.  Certainly the foundations of the Web and the
> Semantic Web could be re-architected or supplanted eventually.  But there
> is a vast difference between using a new media type, a new query language
> or even a new protocol, and using a new identification scheme (URIs) or a
> new universal data model (RDF).
>
> For the foreseeable future, RDF is *essential* to the Semantic Web because
> the Semantic Web relies on having a standard universal data model, just as
> URIs are *essential* to the Web because the Web relies on having a standard
> universal identification scheme.
>
> Therefore, if you believe that Linked Data is intended to support the
> goals of the Semantic Web, or if you believe that Linked Data is "the
> Semantic Web done right", then for the foreseeable future RDF is *required*
> for Linked Data (though the data does not have to *look* overtly like RDF).
>
> We're talking about what the term "Linked Data" means *now* -- not what it
> might mean in 10, 20 or 50 years.
>
> David
>


Re: What Does Point Number 3 of TimBL's Linked Data Mean?

2013-06-21 Thread Nathan Rixham
Linked Data is a moving target, it's not Linked Data 1.0, 1.1, 2.0 etc,
it's a set of technologies which make it easy to have machine readable data
that is interlinked on the web.

If Linked Data is built on HTTP currently, then the media types used have
to be registered, which limits the set, but this set of supported
mediatypes can and will change over time, as will the protocols used, as
will the ontologies and the data, and so forth.

You can't lock it in stone, or preclude innovation and new specifications,
common sense and basic web architecture entail using URIs/IRIs, common
protocols (HTTP), registered media types, and so forth, but if a large
ecosystem of data in a new media type is developed or an older one
bootstrapped and commonly supported, it's going to be Linked Data.

Interoperability, modularity, and tolerance - they're all critical, and
none of them entail forever using only RDF and SPARQL.


On Fri, Jun 21, 2013 at 11:38 PM, Stephane Fellah wrote:

>
>
>
> On Fri, Jun 21, 2013 at 5:49 PM, Kingsley Idehen 
> wrote:
>
>>  On 6/21/13 3:25 PM, Stephane Fellah wrote:
>>
>> +1 David.
>>
>>  It is clear that interoperability of any system is enabled by a set of
>> widely adopted standards (similar to TCP/IP for internet, HTTP/URI for the
>> Web).  TBL clearly indicated in his revised document that the standards for
>> Linked Data are URI, HTTP,  RDF and SPARQL for the query language. I am not
>> going to argue with this, like I am not going to argue that HTTP is the
>> protocol for hypertext. You may argue that the specs are imperfect, but
>> they are truly a solid foundation for SW architecture. The specs can be
>> revised and improved over time (such as HTTP 1.0, HTTP 1.1, SPARQL 1.1, RDF
>> 1.1, OWL 2.0).
>>
>>  While the writing is TBL's personal opinion, RDF and SPARQL are W3C
>> standards. Introducing other standards would break interoperability of the
>> system. This would be my last intervention on this subject, as I think I
>> explain enough my position. I just do not have the energy and time to keep
>> arguing about this topic,as it brings nothing new on the table to improve
>> the goal of SW.
>>
>>
>> What part of the excerpt below (from my opening post of this thread)
>> contradicts the fact that SPARQL and RDF are W3C standards?
>>
>
> I just said they are the standards for Linked Data. You want to call them
> implementation details. This is misleading because you imply that it is OK
> to use other standards. I think that I differ with you. It is not a detail.
> It is the standard, so you leverage all the technologies and tools developed
> on this foundation.
>
>
>
>>  What do I mean by RDF and SPARQL are Linked Data implementation details?
>>
>> I said:
>>
>> They (RDF and SPARQL) are W3C standards that aid the process of building
>> Linked Data (as outlined in the *TimBL's revised meme*). That said, it
>> doesn't mean that you cannot take other paths to Linked Data while
>> remaining 100% compliant with the essence of *TimBL's original Linked Data
>> meme*.
>>
>>
> Let me make an analogy of the current discussion:
>
> The Open Systems Interconnection (OSI) model is a conceptual model that
> characterizes and standardizes the internal functions of a communications
> system by partitioning it into abstraction layers. This model is used to
> build the Internet.
>
> Now you come and say:
>
> * TCP/IP is an implementation detail of the Internet's OSI stack.
> We do not need to use TCP/IP to make the Internet work, which is true (UDP
> is an alternative protocol, for example).
>
> What happens if you use something other than TCP/IP today? You will build
> your own implementation of the Internet, and you will find yourself pretty
> isolated because you have no way to interoperate with the widely used
> TCP/IP based Internet. You will have to start from scratch and rebuild the
> whole set of tools and technologies to leverage your new standards. You
> fracture the internet into silos. What did you accomplish by introducing a
> new implementation detail, except saying: hey, look at my awesome internet
> implementation that does the same thing as the Internet; if you want to
> use it, you have to buy/use all my technology stack? Guess what my answer
> would be? Good luck getting your proprietary system widely adopted...
>
> To avoid fracture, you have to agree on widely adopted OPEN standards. By
> using OPEN standards, people can build something useful on a stable
> foundation in which there is no commercial interest of any kind. RDF is a
> W3C OPEN standard and is widely used today by developers dealing with
> Linked Data. There are today a lot of tools available built on these
> standards. There is no good incentive to provide an alternative to the RDF
> model. I cannot see any better or simpler model than the triple model
> based on URIs.

Re: Ending the Linked Data debate -- PLEASE VOTE *NOW*!

2013-06-13 Thread Nathan

David Booth wrote:

A heated debate has been raging about the accepted meaning
of the term "Linked Data" in the context of the Semantic
Web community -- whether or not this term implies the use
of RDF.  


Pointless, or are you going to trademark the term and sue anybody who 
uses it to refer to anything other than RDF?


It boils down to some people saying "we made/use this, use it too", 
others saying "we'd like to use this too, it does the same job and is 
compatible", and Kingsley and some others saying "we can integrate it 
all, why worry".


Axioms of tolerance and modular design apply here:
a) tolerate other over-the-wire types
b) RDF is part of a modular design, namely linked data and the semantic web.

So RDF may spring to mind when you say Linked Data, given it's so 
prevalent, but Linked Data refers to a set of things; RDF is just one of 
them.







Re: Linked Data Dogfood circa. 2013

2013-01-08 Thread Nathan

Hugh Glaser wrote:

Please name applications!
Go on, you must be able to name one to support your view.


That's a fair, but also unfair, question to ask.

Most apps fall into one of two categories:

  a) those that are for a silo and pull data from that silo.
  b) those that pull data from multiple sources.

We must consider both separately.

Regarding a) Many of the big data silos use semantic web techs and 
linked data, and thus their applications use it indirectly - they may 
even use it directly, but who'd know and what value would that be?


Regarding b) The apps which pull data from multiple sources invariably 
do it via a server side application which cleans, analyses and merges 
the data (feedly, fliboard, currents etc), or deal with specific media 
types like images (500px for example). Again, those server side 
applications often use semantic web techs and linked data, and thus 
their applications also use it indirectly.


The primary problem is that for any application to be popular it needs 
to provide a good, reliable user experience; that requires dependable 
clean data, and that usually requires integrating and cleaning the data 
before sending it to the app which the end user is using.


There are only really two viable approaches to providing a good user 
experience with data from the wild:

  1) Clean the data first
  2) Use data which has a simple dependable structure (jpg, atom, rss etc)

"Linked Data" often already comes under the bracket of (1), but that 
won't count in a definition of a "Linked Data Application", and it 
doesn't currently come under the bracket of (2) - other than the RSS 
case, which many would discount - because it can use any old ontology 
from anywhere on the web, rather than a rigid well known schema.


Thus, I fear your question is fair, but ultimately, if that's your 
measurement of how well linked data is doing, you'll always have a very 
negative view of it - and thus it is unfair.


Hope that helps the discussion a bit,

Nathan



Re: Expensive links in Linked Data

2012-09-28 Thread Nathan

SERVANT Francois-Paul wrote:

Hi,

How do you include links to results of computations in Linked Data?

For instance, you publish data about entities of a given class. A property, 
let's call it :expensiveProp, has this class as domain, and you know that 
computing or publishing the corresponding triples is expensive. In such a case, 
you don't want to produce these triples each time one of your entities is 
accessed. You want to include in the representation of your entity only a link 
to that information.

A no-brainer, at first sight.

Are there any recommended ways to proceed?


Cache at the edges / leverage seeAlso.
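A sketch of that pattern in Turtle (all URIs hypothetical): the representation of the entity stays cheap and carries only a link, which clients dereference on demand and which an edge cache can hold:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/> .

# Cheap triples, served on every access of the entity:
ex:thing1 a ex:Widget ;
    rdfs:seeAlso <http://example.org/thing1/expensive> .

# The expensive triples live behind that second URI and are computed
# (and cached) only when a client actually dereferences it.
```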

As an aside, you may enjoy scanning over the Computational REST 
dissertation.




Re: Describing Stuff You Like using Turtle

2012-09-22 Thread Nathan

Kingsley Idehen wrote:

On 9/22/12 8:26 AM, Hugh Glaser wrote:
Sorry, I realise this is not exactly on topic (which is about crafting 
turtle, not specifically about likes), but…


It reminds me of some fun we had in 2004.
Ah, halcyon days - those balmy times before Linked Data came along.
We did a document about it for the 1st (and only? :-) ) FOAF Workshop.
http://eprints.soton.ac.uk/id/eprint/265453

whatilike.org is still there, but seems to have lost its 3store, which 
is not surprising after 8 years and several machine moves.
I guess I must have spent a good few bucks keeping the domain alive, 
waiting for the time (for one of us) to get back to it.


Who knows?
Maybe this will prompt someone.
Any one?


Hugh,

I find this on topic.

The key point we need to revisit is that triple or quad stores aren't 
mandatory for endeavors like this. The pattern can be much simpler, and 
it goes something like this (circa. 2012):


1. Sign up for storage services via the likes of Dropbox, SkyDrive, 
Amazon S3 etc. (I left Google Drive and Box.NET off the list because they 
don't support mime type text/plain) -- you get 2GB free on average these 
days


2. Create a local Turtle document

3. Upload it to your service provider's folder (these are automatically 
part of your local storage setup, post installation, so no manual 
mounting is required)


4. Share your new Linked Data doc with the world

5. Linked Data aware user agents take care of the visualization etc..

All of this is now possible without:

1. Domain ownership
2. DNS server access and admin control
3. Web server access and admin control -- no need for URL re-write rules
4. A SPARQL compliant triple or quad store
5. HttpRange-14 distractions and confusion re. URI disambiguation and 
patterns.


It just works.
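Step 2 of that recipe, the hand-crafted Turtle document, might look like this (the name and topics are hypothetical, and foaf:topic_interest is just one plausible property for expressing "likes"):

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<#me> a foaf:Person ;
    foaf:name "Jane Doe" ;
    foaf:topic_interest <http://dbpedia.org/resource/Semantic_Web> ,
                        <http://dbpedia.org/resource/Turtle_(syntax)> .
```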

When folks realize that they can express their Likes and DisLikes 
(amongst other things) in simple Linked Data documents over which they 
possess full access control, the game changes completely. The murkiness 
around Linked Data comprehension vaporizes.


Has anybody done a quick turtle editor with an XMLHttpRequest upload 
straight to S3 yet? It could make it even easier...


Best,

Nathan


On 19 Sep 2012, at 19:44, Kingsley Idehen  wrote:


All,

As I've often stated, there's a premature optimization bug in the 
Linked Data narrative. We early adopters concluded -- incorrectly -- 
that nobody would ever need to craft Linked Data documents by hand. 
Of course, a lot of that had to do with RDF/XML and Turtle's 
protracted journey towards W3C recommendation status. Anyway, 
focusing on the present, we have an opportunity to fix the 
aforementioned narrative bug by revisiting the value of crafting 
Linked Data documents by hand.


I've dropped a simple post showcasing the use of a Turtle document to 
describe some of the things I like [1].


Why is Turtle important?
People master new concepts by exercise. Crafting Turtle documents by 
hand brings focus back to subject-predicate-object or 
entity-attribute-value concept comprehension, with regards to basic 
sentence structure etc..


How does it aid Linked Data demystification etc?

It adds a Do-It-Yourself dimension that boils down to constructing a 
local Turtle document and publishing it to the Web, via a plethora of 
storage services that remove the following hurdles:


1. Domain Ownership
2. DNS Server access and admin level control
3. HTTP Server access and admin level control
4. URI pattern issues confusion and distraction.

Once end-users understand the basics, reinforced by simple exercises, 
it equips them with the foundation and critical context for tools 
appreciation.


Turtle is very important to Linked Data comprehension. It's a syntax 
that's user profile agnostic, unlike others that ultimately serve 
specific programmer profiles:


1. Turtle -- everyone
2. HTML+Microdata -- HTML programmers
3. (X)HTML+RDFa -- (X)HTML programmers
4. JSON-LD -- Javascript programmers
5. RDF/XML -- no comment, but certainly not 1-4 :-)


Links:

1. http://bit.ly/SBDmXr -- Turtle document describing stuff I like .

--

Regards,

Kingsley Idehen   
Founder & CEO

OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

















Re: Linked Data Business Models?

2012-07-29 Thread Nathan

Kingsley Idehen wrote:

All,

There is a tendency to assume an eternal lack of functional and scalable 
business models with regards to Linked Data. I think it's time for an 
open discussion about this matter.


It's no secret: I've never seen business models as a challenge for Linked 
Data. Quite the contrary. That said, instead of a dump from me about my 
viewpoints on Linked Data models, how about starting this discussion by 
identifying any non "Advertising based" business model that has 
actually worked on the Web to date.


As far as I know, "Advertising" and "Surreptitious Personal Profile Data 
Wholesale" are the only models that have made a difference to the bottom 
lines of: Google, Facebook, Twitter, Yahoo! and other non eCommerce 
oriented behemoths.


Based on the above, let's have a serious and frank discussion about 
business models, with the understanding that one size will never fit 
all, ever, so this rule cannot be overlooked re. Linked Data. Also 
remember, business models aren't silver bullets; they are typically 
aligned with markets (qualified and quantified pain points) and the 
evolving nature of tangible and monetizable value.


Hopefully, the floor is now open to everyone that has a vested interest 
in this very important matter :-)


Perhaps linked data is the very thing hindering itself.

To explain: RDF/EAV, linked data principles and general sem-webery would 
vastly improve every single project, system, application, and website 
I've ever made for commercial clients.


However, the primary gains come from use *behind* the public interface, 
at the data and business logic tiers.


For sure that event management system I made a few years ago for big 
internationals would have, and for sure the stock management & reporting 
system I made a decade ago would have, and likewise every other project 
in between.


Linked (open) data is great, but when all the focus is on skinny public 
data, and systems themselves aren't built on the core principles, and 
investors / clients don't see the benefits to their back end business, 
then there isn't much call for them to use it.


This "business models" discussion confuses me, as every business model, 
and every business I've ever encountered, would greatly benefit from all 
the sem-webbery goodness, and be grateful for it.


The adoption costs are still high though, the ontologies are often 
focussed on skinny "public" data, and ultimately who's using any of 
what's been created in a normal web based business environment?


Seriously, who here has an ecommerce shop which runs on linked data / 
rdf (as opposed to exposing it through GR, microdata etc)?
Who has a stock inventory and reporting system where the core (not 
secondary/additional) database is an RDF one?
Who's running a contextually aware infotizing/advertising network where 
all the data is linked/rdf?


The business models all exist already, and sem web / linked data can be 
applied to each of them.


But who's going to build things on linked open data, when they aren't 
already using it behind the public interface in their own business?


And to those who've tried, I'm sure you'll agree that there's still a 
fair bit of work to be done in order to build full business apps on top 
of rdf & linked data, many unanswered questions, and big learning curves 
for those who try.


Things are improving though; perhaps it just needs an extra kick from 
people using RDF/linked data behind the public interface, instead of 
creating demos using skinny public data.


Best,

Nathan



Re: Is there a general "preferred" property?

2012-07-18 Thread Nathan

Bernard Vatant wrote:

Nathan

Interesting discussion indeed, at least allowing me to discover
con:preferredURI I missed so far ... although I was looking for something
like that, and it was just under my nose in LOV :)
http://lov.okfn.org/dataset/lov/search/#s=preferred

If I parse correctly the definition of con:preferredURI  (A string which is
the URI a person, organization, etc, prefers that people use for them.) it
applies only to some agent able to express its preference about how
he/she/it should be identified. The domain is open, but if I was to close
it I would declare it to be foaf:Agent.


How about wikipedia articles / dbpedia articles? They quite often have 
URIs which redirect to a preferred canonical one - likewise rel=canonical 
as used throughout the web for "documents"?


This is quite different from skos:prefLabel, which expresses the preference
of a community of vocabulary users about how some concept should be named
(a practice coming from the library/thesaurus community). The borderline
case is authorities: when LoC uses skos:prefLabel in their authority files
for people or organizations, they don't ask those people or organizations if
they agree (many of them not being in a position to answer anyway ...).

Seems we lack some x:prefURI expressing the same type of preference as
skos:prefLabel.
With of course con:preferredURI rdfs:subPropertyOf x:prefURI

And a general property x:hasURI

x:hasURI x:preferred x:prefURI

Meaning that :

ex:foo x:hasURI 'bar'

entails

<bar> owl:sameAs ex:foo


agree, seems to be the same functionality as described by rel=canonical 
which is in general use too.



Not sure of notations here; what I mean by <bar> is the resource whose
URI is the string 'bar'

And while we are at it x:altURI would be nice to have also :)


Unsure if I agree with this one... I can't see what it brings to the 
table beyond multiple values for owl:sameAs, or in the other use case 
beyond multiple values for rdfs:label?


Best,

Nathan


Bernard

2012/7/17 Nathan 


Good point and question! I had assumed preferred by the owner of the
object, just as you have a con:preferredURI for yourself.

The approach again comes from you, the same approach as
link:listDocumentProperty (which now appears to have been dropped from the
link: ontology?)

Cheers,

Nathan


Tim Berners-Lee wrote:


Interesting to go meta on this with x:preferred .

What would be the meaning of "preferred" -- "preferred by the object
itself or
the owner of the object itself"?

In other words, I wouldn't use it to store in a local store my preferred
names
for people, that would be an abuse of the property.

Tim

On 2012-07 -15, at 19:42, Nathan wrote:

 Essentially what I'm looking for is something like

 foaf:nick x:preferred foaf:preferredNick .
 rdfs:label x:preferred foaf:preferredLabel .
 owl:sameAs x:preferred x:canonical .

It's nice to have con:preferredURI and skos:prefLabel, but what I'm
really looking for is a way to let machines know that x value is preferred.

Anybody know if such a property exists yet?

Cheers,

Nathan















Re: Is there a general "preferred" property?

2012-07-17 Thread Nathan

Hugh Glaser wrote:

I think we are probably on much the same page :-)
(Most, if not all, of my questions were actually rhetorical - sorry that was 
not clear.)


It was clear, I just like talking things out :)


So you are thinking of instances as well - and of course, instances are URIs 
just like properties.
I of course make no assumption that skos:prefLabel is more "x:preferred" than 
skos:label - there are some words about it in the SKOS description, but the semantics of 
the word preferred are not defined, and might differ from the semantics for x:preferred.


exactly - it would be nice to define and expose those semantics in a 
generalized (and webized / machine readable) way.



I am quite happy to have either skos:prefLabel x:preferred skos:label or vice 
versa, although one way round is a bit strange to the human reader.
But I think I can use your x:preferred in the way I describe to select 
particular triples.

The x:preferred approach would handle this at the ontology level, as explained 
above - it would capture that when you have multiple values for :a and a single 
value for :b, and that :b x:preferred :a, then the value for :b is the 
preferred/canonical value out of those specified for :a.


I thought this described it, but now I am not certain I am quite clear:
For your example:

 :foo rdfs:label "Michael"@en, "M. Jackson"@en, "Michael Jackson"@en ;
  skos:prefLabel "Michael Jackson"@en .


With

 rdfs:label x:preferred skos:prefLabel .


I would get that :a is skos:prefLabel and :b is rdfs:label by substitution in 
your paragraph, but :a does not have multiple values.
So maybe you mean?:
skos:prefLabel x:preferred rdfs:label
In which case the multiple :a is rdfs:label and the single :b is skos:prefLabel

Anyway, which ever way round it is, is it that the intended meaning of


other way around.


rdfs:label x:preferred skos:prefLabel

 is that the (single) object of the skos:prefLabel triple ( "Michael 
Jackson"@en) is in some sense preferred?
Or is the :foo skos:prefLabel  "Michael Jackson"@en triple in some sense 
preferred?
Or probably something else?


that out of the set of rdfs:label values ("Michael"@en, "M. Jackson"@en, 
"Michael Jackson"@en), "Michael Jackson"@en is the preferred one.
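That selection rule is easy to mechanize; here is a sketch in Python, with properties as plain strings and the proposed "general x:preferred specific" statements collapsed into a lookup table (all names illustrative, x:preferred being the hypothetical property under discussion):

```python
# "general x:preferred specific" statements, as a lookup table:
# rdfs:label x:preferred skos:prefLabel, owl:sameAs x:preferred con:preferredURI.
PREFERRED = {
    "rdfs:label": "skos:prefLabel",
    "owl:sameAs": "con:preferredURI",
}

def preferred_value(triples, subject, prop):
    """Pick the preferred object for (subject, prop): use the value of the
    'preferred' variant property if present, else any value of prop itself."""
    variant = PREFERRED.get(prop)
    if variant is not None:
        for s, p, o in triples:
            if s == subject and p == variant:
                return o
    # Fallback: no preferred variant asserted, take any value of prop.
    for s, p, o in triples:
        if s == subject and p == prop:
            return o
    return None
```

With the rdfs:label/skos:prefLabel triples from the example, asking for rdfs:label would then yield "Michael Jackson".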


Additionally, as you may have said earlier,

  { :a x:preferred :b } entails { :b rdfs:subPropertyOf :a }

Nathan


(Not rhetorical :-) )
Best
Hugh

On 17 Jul 2012, at 16:38, Nathan wrote:


Hugh Glaser wrote:

Hi,
I think Nathan is talking about properties of properties, not instances.

I am, but not in the way you think (afaict).


As a real example, in my Detail RBK/dotAC rendering I have (at least) the 
following predicates to look at for names:
(<http://xmlns.com/foaf/0.1/name> <http://www.aktors.org/ontology/portal#full-name> 
<http://www.w3.org/2000/01/rdf-schema#label> <http://www.rkbexplorer.com/ontologies/jisc#name> 
<http://www.w3.org/2004/02/skos/core#prefLabel> <http://www.w3.org/2004/02/skos/core#altLabel> 
<http://rdf.freebase.com/ns/type.object.name>)
I can either use these to gather all the names I can, or use it as an ordered 
list to get the one I prefer (if any) - it depends on the display I want to 
give.

Or you can use subPropertyOf entailment (rdfs7); when you apply it you'll find 
you have an rdfs:label triple for each of the properties you listed, other than 
portal#full-name.
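The rdfs7 rule itself is tiny; a sketch over triples as string tuples (purely illustrative, not a full RDFS reasoner -- it applies only rdfs7, skipping e.g. rdfs5 subPropertyOf transitivity):

```python
SUBPROP = "rdfs:subPropertyOf"

def rdfs7_closure(triples):
    """Apply rdfs7 to a fixpoint: if (p subPropertyOf q) and (s p o) hold,
    then infer (s q o)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subprops = [(s, o) for s, p, o in closed if p == SUBPROP]
        for s, p, o in list(closed):
            for sub, sup in subprops:
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed
```

So declaring skos:prefLabel rdfs:subPropertyOf rdfs:label makes every prefLabel triple yield an rdfs:label triple as well.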


So I interpreted Nathan as asking if there was anything that allowed me to say 
what the preferred order of predicate choice might be.

This is where the confusion was, if you consider the following graph:

 :foo rdfs:label "Michael"@en, "M. Jackson"@en, "Michael Jackson"@en ;
  skos:prefLabel "Michael Jackson"@en .

As humans we know what "prefLabel" means, but a machine doesn't.

Another example:

 :foo owl:sameAs , <http://example.org/things/bits#foo> ;
  con:preferredURI :foo .

As humans we again know to treat :foo as the canonical/preferred URI for this 
thing (just as TimBL has in his foaf). Again, no machine understanding.

Thus I thought, if we're going to have a proliferation of preferred-style 
properties, it would be good to have a single machine-readable property that 
explains this.

So for the two examples above you could have in the ontologies:

 rdfs:label x:preferred skos:prefLabel .
 owl:sameAs x:preferred con:preferredURI .

Then machines could understand what we do too.


Do I prefer <http://www.w3.org/2000/01/rdf-schema#label> over 
<http://www.w3.org/2004/02/skos/core#altLabel> for example (for my particular 
application)?

You're dealing with application display preferences here, rather than specifying 
in descriptions of things that a thing is known by several values (names, URIs) 
and that this particular one is preferred/canonical according to the resource 
(or resource owner).

Re: position in cancer informatics

2012-07-17 Thread Nathan

Can you open this right up for everybody to be involved?

I know I for one would be happy to invest free time to looking at these 
datasets to find patterns - are they open and available online, any 
pointers to get started, anything at all that would enable me (and 
hopefully others skilled here) to work on this?


It sounds less like a "position" and more like a global need that those of us 
who can should all be putting time into.


Best,

Nathan

Helena Deus wrote:
Dear all, 

We have an exciting research assistant position open at DERI for a chance to work with Cancer Informatics! We are looking for an enthusiastic developer who is familiar with bioinformatics concepts. Your role will be exploring cancer-related datasets and looking for patterns (applying, for example, machine learning techniques) that can be used for personalized medicine. 

Please don't hesitate to Fw. this to whomever you think might be interested. 

To apply or to ask for more information, please reply to me (helena.d...@deri.org) with CV + motivation letter 

Kind regards, 
Helena F. Deus, PhD

Digital Enterprise Research Institute
helena.d...@deri.org









Re: Is there a general "preferred" property?

2012-07-17 Thread Nathan

Hugh Glaser wrote:

Hi,
I think Nathan is talking about properties of properties, not instances.


I am, but not in the way you think (afaict).


As a real example, in my Detail RBK/dotAC rendering I have (at least) the 
following predicates to look at for names:
(<http://xmlns.com/foaf/0.1/name> <http://www.aktors.org/ontology/portal#full-name> 
<http://www.w3.org/2000/01/rdf-schema#label> <http://www.rkbexplorer.com/ontologies/jisc#name> 
<http://www.w3.org/2004/02/skos/core#prefLabel> <http://www.w3.org/2004/02/skos/core#altLabel> 
<http://rdf.freebase.com/ns/type.object.name>)

I can either use these to gather all the names I can, or use it as an ordered 
list to get the one I prefer (if any) - it depends on the display I want to 
give.


Or you can use subPropertyOf entailment (rule rdfs7): when you apply it, 
you'll find you have an rdfs:label triple for each of the properties you 
listed, other than portal#full-name.



So I interpreted Nathan as asking if there was anything that allowed me to say 
what the preferred order of predicate choice might be.


This is where the confusion was. Consider the following graph:

  :foo rdfs:label "Michael"@en, "M. Jackson"@en, "Michael Jackson"@en ;
   skos:prefLabel "Michael Jackson"@en .

As humans we know what "prefLabel" means, but a machine doesn't.

Another example:

  :foo owl:sameAs , <http://example.org/things/bits#foo> ;
   con:preferredURI :foo .

As humans we again know to treat :foo as the canonical/preferred URI for 
this thing (just as TimBL has in his foaf). Again, no machine understanding.


Thus I thought if we're going to have a proliferation of preferred 
style properties, it would be good to have a single machine property 
that explained this.


So for the two examples above you could have in the ontologies:

  rdfs:label x:preferred skos:prefLabel .
  owl:sameAs x:preferred con:preferredURI .

Then machines could understand what we do too.


Do I prefer <http://www.w3.org/2000/01/rdf-schema#label> over 
<http://www.w3.org/2004/02/skos/core#altLabel> for example (for my particular 
application)?


You're dealing with application display preferences here, rather than 
specifying in descriptions of things that a thing is known by several values 
(names, URIs) and that this particular one is preferred/canonical according 
to the resource (or resource owner).



How do I represent that is what I am doing to agents asking (in RDF/OWL of 
course)?
And how would a data publisher tell my consumer agent what they think I should 
prefer?


The x:preferred approach would handle this at the ontology level, as 
explained above - it would capture that when you have multiple values 
for :a, a single value for :b, and :a x:preferred :b, then the 
value for :b is the preferred/canonical value out of those specified for :a.


In my case, following Nathan's email, I would have a chain of 
<http://www.aktors.org/ontology/portal#full-name> x:preferred <http://xmlns.com/foaf/0.1/name> .

<http://www.w3.org/2000/01/rdf-schema#label> x:preferred  
<http://www.aktors.org/ontology/portal#full-name> .
etc.
Which would be the (meta)metadata about the service I am providing.


I think it may be more a case of capturing this using owl/rif/n3 - 
supposing you have the data:


  :foo rdfs:label "Michael"@en, "M. Jackson"@en, "Michael Jackson"@en ;
   foaf:name "Michael Jackson"@en .

and in an ontology:

  rdfs:label x:preferred skos:prefLabel .

and you have a personal preference that says if a foaf:name and 
rdfs:label are present for something, then the foaf:name is the 
preferred value, then you can capture this with a rule like:


{ ?t rdfs:label ?l ; foaf:name ?n } => { ?t skos:prefLabel ?n }

Using several rules of this kind you could capture all your preferences, 
and your application would universally understand which to display - not 
just for labels, but across all properties with multiple values where one 
is preferred, as :prop1 x:preferred :prop2 . would encode this.
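A rule of this kind is straightforward to apply in code. A minimal sketch, with triples as plain tuples and abbreviated CURIEs as strings:

```python
# Sketch of the preference rule above: if a subject has both rdfs:label
# and foaf:name values, entail skos:prefLabel from the foaf:name value.

def apply_pref_rule(triples):
    """{ ?t rdfs:label ?l ; foaf:name ?n } => { ?t skos:prefLabel ?n }"""
    out = set(triples)
    labelled = {s for s, p, _ in triples if p == "rdfs:label"}
    for s, p, o in triples:
        if p == "foaf:name" and s in labelled:
            out.add((s, "skos:prefLabel", o))
    return out

data = {(":foo", "rdfs:label", "Michael"),
        (":foo", "rdfs:label", "M. Jackson"),
        (":foo", "foaf:name", "Michael Jackson")}
assert (":foo", "skos:prefLabel", "Michael Jackson") in apply_pref_rule(data)
```

Each additional preference becomes another small rule of the same shape, and the display code only ever has to look for skos:prefLabel.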


Make sense?

Best, Nathan


However, in some sense rdfs:subPropertyOf might imply this.
Were I loading the data into a store, rather than just doing pure Linked Data 
URI resolution, I would be able to assert the rdfs:subPropertyOf relation, 
which might be seen to suggest that the most specific property (is that the 
right terminology?) is a good one to choose.
I then have the challenge of doing a query that finds that out, of course.
I am guessing that from Nathan's meaning of  x:preferred, it would seem that
x:preferred rdfs:subPropertyOf rdfs:subPropertyOf .
By the way, if I only want one preferred property, I can look one of the 
properties up in a sameAs store such as sameAs.org to find out what the 
suggested canon is (and what the other predicates might be.)

Best
Hugh

On 16 Jul 2012, at 22:28, Tim Berners-Le

Re: Is there a general "preferred" property?

2012-07-17 Thread Nathan
Good point and question! I had assumed preferred by the owner of the 
object, just as you have a con:preferredURI for yourself.


The approach again comes from you - the same approach as 
link:listDocumentProperty (which now appears to have been dropped from the 
link: ontology?).


Cheers,

Nathan

Tim Berners-Lee wrote:

Interesting to go meta on this with x:preferred .

What would be the meaning of "preferred" -- "preferred by the object itself or
the owner of the object itself"?

In other words, I wouldn't use it to store in a local store my preferred names
for people, that would be an abuse of the property.

Tim

On 2012-07-15, at 19:42, Nathan wrote:


Essentially what I'm looking for is something like

 foaf:nick x:preferred foaf:preferredNick .
 rdfs:label x:preferred foaf:preferredLabel .
 owl:sameAs x:preferred x:canonical .

It's nice to have con:preferredURI and skos:prefLabel, but what I'm really 
looking for is a way to let machines know that x value is preferred.

Anybody know if such a property exists yet?

Cheers,

Nathan










Is there a general "preferred" property?

2012-07-15 Thread Nathan

Essentially what I'm looking for is something like

  foaf:nick x:preferred foaf:preferredNick .
  rdfs:label x:preferred foaf:preferredLabel .
  owl:sameAs x:preferred x:canonical .

It's nice to have con:preferredURI and skos:prefLabel, but what I'm 
really looking for is a way to let machines know that x value is preferred.


Anybody know if such a property exists yet?

Cheers,

Nathan



subDatatypes or multiple datatypes

2012-07-15 Thread Nathan

Evening all,

I've come upon a requirement to have either datatype inheritance or 
multiple datatypes for literal values.


Simply put, for application level processing and data validation it's 
very useful to have specific datatypes, for example one for a UK postal 
code, another for SSNs, another for integer product codes which must 
always be within specific bounds and so forth.


  note: using validation data in the ontology/schema (xsd: length,
  pattern, enumeration, totalDigits and so forth)

However, for generic use of the data, querying via sparql and so forth 
it's useful to have generic data types such as the common xsd: types.


What I'm really looking for here is one of the following (other 
suggestions and discussions most welcomed):


1) datatype inheritance (would require changes to sparql engines methinks)
2) multiple datatypes (would require changes to rdf and sparql afaict)
3) a way to apply xsd restrictions to properties rather than datatypes 
(perhaps the most realistic)
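Option 3 can be approximated today at the application level. A rough sketch, where the facet table and the simplified postcode pattern are illustrative assumptions rather than a real ontology mechanism:

```python
# Sketch of applying XSD-style facets (pattern, bounds) to values of a
# property at the application level. The "ex:" properties and the
# simplified UK-postcode pattern are assumptions for illustration.
import re

FACETS = {
    "ex:postalCode": {"pattern": r"[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}"},
    "ex:productCode": {"minInclusive": 1000, "maxInclusive": 9999},
}

def valid(prop, value):
    """Check a property value against the facets declared for that property."""
    f = FACETS.get(prop, {})
    if "pattern" in f and not re.fullmatch(f["pattern"], str(value)):
        return False
    if "minInclusive" in f and value < f["minInclusive"]:
        return False
    if "maxInclusive" in f and value > f["maxInclusive"]:
        return False
    return True
```

The data itself can then stay typed with the generic xsd: datatypes for SPARQL's benefit, while validation rides on the property rather than the datatype.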


Any help most appreciated,

Best,

Nathan



Re: httpRange-14 Change Proposal

2012-03-28 Thread Nathan

Jeni,

First, thanks for confirming - many responses in line from here:

Jeni Tennison wrote:
> The server *can* return the same content from the /uri URI and from
> the /uri-documentation URI, but it does not have to, and it wouldn't
> be sensible to do so for an image. Your first question asked if the
> server could return the same content, your second asked if it must.

Apologies for any confusion from my wording, however I did mean "can" 
rather than "must".


In a nutshell then, this proposal says that you can return a 200 OK for 
a GET request on any URI, but if you return "a representation of a 
description of the thing referred to by <uri>" rather than "a 
representation of the thing referred to by <uri>" then you should say so 
by including the special "<uri> :describedby <uri-documentation>" 
triple.


Additionally, rather than special-casing this so that this rule lets a 
publisher override the default (a 200 OK returns a representation of the 
resource), the proposal also aims to change web arch and the HTTP 
specification such that a 200 OK in response to a GET no longer returns 
a representation of the requested URI; rather it just returns a 
representation which you must consult to find out what it is.


That's quite a large change to the web / web arch / http.


On 28 Mar 2012, at 16:07, Nathan wrote:

Jeni Tennison wrote:

Yes, that's correct. With no constraining Accept headers, it could alternatively return HTML 
with embedded RDFa with a <link> element, for example.

Is that universally true?

Suppose /uri identified a PDF formatted ebook, or a digital image of a monkey 
in JPEG format, or even an RDF document.


Then it would return those things. I think that you may have leapt to the 
conclusion that /uri *always* returns the same as /uri-documentation. There's 
nothing to my knowledge that says that, indeed given that you can have several 
:describedby links it would be impossible.


Sorry no, not *always*, just *always could* or *always can*. As in, it 
would be universally true that for any successful GET request you would 
receive a representation, and that representation may be a 
representation of the <target-uri>, or it may be a representation of 
<another-uri> which describes the target-uri.



Question A:

Currently we have:
<http://example.org/uri> - a JPEG image of a monkey.

When you issue a GET on that URI the server currently responds
200 OK
Content-Type: image/jpeg
Link: <http://example.org/uri-documentation>; rel="describedby"

So under this new proposal, the server can return the contents of 
/uri-documentation with a status of 200 OK for a GET on /uri?


Under the proposal, the server would return the JPEG with a 200 OK for a GET on /uri. http://example.org/uri-documentation would return a description of the JPEG in some machine-readable format. 


Or more accurately, the server MAY return the JPEG with a 200 OK for a 
GET on /uri, or it may return the same result as a successful GET on 
/uri-documentation (a description of the /uri in some machine readable 
format).


Is this limited to machine-readable formats? Why not human-readable too?

It appears that if one can return text/turtle for a GET request on 
<x>, where { <x> a :Horse }, then one should also be able to return 
an image/jpeg which visually describes the horse.



If yes, this seems like massively unexpected functionality, like a proposal to treat 
"Accept: some/meta-data" like a DESCRIBE verb, and seems to exaggerate the URI 
substitution problem (as in /uri would be taken as naming the representation of 
/uri-documentation).

If no, where's the language which precludes this? (and how would that language 
go, given that it's exactly the same protocol flow and nothing has changed - 
other than the reader presuming that /uri now identifies something that does 
have a representation that can be transferred over HTTP vs identifying 
something that doesn't have a representation that can be transferred over HTTP).


I don't really understand what you think it needs to say I'm afraid.





Question B:

How would conneg work, and what would the presence of a Content-Location 
response header mean? Would HTTPBis need to be updated?


I can't see any way in which any of that would work differently from currently.


Okay, given the use-case of a GET on <uri> returning 200 OK, and the 
response containing a representation of <uri-documentation> in text/turtle:


What would the value of the Content-Location header be? /uri-documentation?

short version: this proposal would mean many sections of httpbis would 
need to be reworded and changed, as it conflicts to the point of saying 
the opposite.



Question C:

Currently 303 "indicates that the requested resource does not have a representation of its own 
that can be transferred by the server over HTTP", and the Link header makes it clear that you 
are dealing with two different things (

Re: httpRange-14 Change Proposal

2012-03-28 Thread Nathan

Jeni Tennison wrote:

Nathan,

On 28 Mar 2012, at 16:07, Nathan wrote:

Jeni Tennison wrote:

Yes, that's correct. With no constraining Accept headers, it could alternatively return HTML 
with embedded RDFa with a <link> element, for example.

Is that universally true?

Suppose /uri identified a PDF formatted ebook, or a digital image of a monkey 
in JPEG format, or even an RDF document.


Then it would return those things. I think that you may have leapt to the 
conclusion that /uri *always* returns the same as /uri-documentation. There's 
nothing to my knowledge that says that, indeed given that you can have several 
:describedby links it would be impossible.


Question A:

Currently we have:
<http://example.org/uri> - a JPEG image of a monkey.

When you issue a GET on that URI the server currently responds
 200 OK
 Content-Type: image/jpeg
 Link: <http://example.org/uri-documentation>; rel="describedby"

So under this new proposal, the server can return the contents of 
/uri-documentation with a status of 200 OK for a GET on /uri?


Under the proposal, the server would return the JPEG with a 200 OK for a GET on /uri. http://example.org/uri-documentation would return a description of the JPEG in some machine-readable format. 


Previously I asked:

"With this proposal though we'd be able to say issue a GET to /uri with 
an Accept header value of text/turtle, and the server could return back 
the contents of /uri-documentation, with a status of 200 OK"


And you replied "Yes, that's correct."

The above is exactly the same, so I'll ask again:

With this proposal, can a server return the contents of 
/uri-documentation with a status of 200 OK for a GET on /uri?


If the answer is yes, then it must be yes for my previous "JPEG of a 
monkey" example (universality); if the answer is no, then how does this 
proposal work? Apologies for my confusion.


Best,

Nathan


If yes, this seems like massively unexpected functionality, like a proposal to treat 
"Accept: some/meta-data" like a DESCRIBE verb, and seems to exaggerate the URI 
substitution problem (as in /uri would be taken as naming the representation of 
/uri-documentation).

If no, where's the language which precludes this? (and how would that language 
go, given that it's exactly the same protocol flow and nothing has changed - 
other than the reader presuming that /uri now identifies something that does 
have a representation that can be transferred over HTTP vs identifying 
something that doesn't have a representation that can be transferred over HTTP).


I don't really understand what you think it needs to say I'm afraid.


Question B:

How would conneg work, and what would the presence of a Content-Location 
response header mean? Would HTTPBis need to be updated?


I can't see any way in which any of that would work differently from currently.


Question C:

Currently 303 "indicates that the requested resource does not have a representation of its own 
that can be transferred by the server over HTTP", and the Link header makes it clear that you 
are dealing with two different things (/uri and /uri-documentation), but where does this proposal 
make it clear at transfer protocol level that the representation included in the http response is a 
representation of another resource which describes the requested resource (rather than it being as 
the spec defines "a representation of the target resource")?


The proposal says that applications can draw no conclusions from information at 
the transfer protocol level about /uri. In particular, it can't tell whether 
the representation that is returned with /uri is *the content* of /uri or *the 
description* of /uri. Further information about /uri (eg that it is a 
foaf:Person) may help the application work out that the representation was *a 
description*.

However, an application can draw conclusions about /uri-documentation, assuming 
it gives a 2XX response, because it has been retrieved as the result of 
following a :describedby link (or if it were the target of a 303 redirection). 
The application can tell that the representation from /uri-documentation is 
*the content* of /uri-documentation and *the description* of /uri.


Either way, there is no implication that what you've got from 
http://example.org/uri is the content of http://example.org/uri (or that 
http://example.org/uri identifies an information resource), but there is an 
implication that what you get from http://example.org/uri-documentation is the 
content of http://example.org/uri-documentation (and that 
http://example.org/uri-documentation is an information resource).

Sorry I don't follow - how is there an implication from a 200 OK for <uri> that it's 
not an IR, and for <uri-documentation> that it is an IR?


Because /uri-documentation was reached through a :describedby link. This extra 
information allow

Re: httpRange-14 Change Proposal

2012-03-28 Thread Nathan

Jeni Tennison wrote:

Nathan,

Yes, that's correct. With no constraining Accept headers, it could alternatively return HTML 
with embedded RDFa with a <link> element, for example.


Is that universally true?

Suppose /uri identified a PDF formatted ebook, or a digital image of a 
monkey in JPEG format, or even an RDF document.


Question A:

Currently we have:
 <http://example.org/uri> - a JPEG image of a monkey.

When you issue a GET on that URI the server currently responds
  200 OK
  Content-Type: image/jpeg
  Link: <http://example.org/uri-documentation>; rel="describedby"

So under this new proposal, the server can return the contents of 
/uri-documentation with a status of 200 OK for a GET on /uri?


If yes, this seems like massively unexpected functionality, like a 
proposal to treat "Accept: some/meta-data" like a DESCRIBE verb, and 
seems to exaggerate the URI substitution problem (as in /uri would be 
taken as naming the representation of /uri-documentation).


If no, where's the language which precludes this? (and how would that 
language go, given that it's exactly the same protocol flow and nothing 
has changed - other than the reader presuming that /uri now identifies 
something that does have a representation that can be transferred over 
HTTP vs identifying something that doesn't have a representation that 
can be transferred over HTTP).


Question B:

How would conneg work, and what would the presence of a Content-Location 
response header mean? Would HTTPBis need to be updated?


Question C:

Currently 303 "indicates that the requested resource does not have a 
representation of its own that can be transferred by the server over 
HTTP", and the Link header makes it clear that you are dealing with two 
different things (/uri and /uri-documentation), but where does this 
proposal make it clear at transfer protocol level that the 
representation included in the http response is a representation of 
another resource which describes the requested resource (rather than it 
being as the spec defines "a representation of the target resource")?



Either way, there is no implication that what you've got from 
http://example.org/uri is the content of http://example.org/uri (or that 
http://example.org/uri identifies an information resource), but there is an 
implication that what you get from http://example.org/uri-documentation is the 
content of http://example.org/uri-documentation (and that 
http://example.org/uri-documentation is an information resource).


Sorry I don't follow - how is there an implication from a 200 OK for 
<uri> that it's not an IR, and for <uri-documentation> that it is an IR?


If there was a Set of all Things (Set-A), then that set would contain two 
subsets: "the set of all things which can be transferred via a transfer 
protocol like HTTP" (Set-B), and everything else (Set-C), which 
comprises Set-A minus Set-B. As far as I can tell, the one thing that 
determines whether something is a member of Set-B or Set-C, for 
HTTP, is that 200 OK in response to a GET - hence why we need the 303.


This proposal appears to try and override that "rule" (fact) by saying 
let the content of a representation define what is a member of Set-B or 
Set-C, however the act of dereferencing itself is what determines 
whether an identified thing is a member of Set-B, as Set-B is the set of 
all things that can be dereferenced. Hence my confusion at this proposal.


Hope that makes sense, and that I've not totally misunderstood.

Best,


Jeni

On 28 Mar 2012, at 14:46, Nathan wrote:


Nathan wrote:

Jeni Tennison wrote:

# Details

In section 4.1, in place of the second paragraph and following list, substitute:

 There are three ways to locate a URI documentation link in an HTTP response:

  * using the Location: response header of a 303 See Other response 
[httpbis-2],  e.g.

303 See Other
Location: <http://example.com/uri-documentation>

  * using a Link: response header with link relation 'describedby' ([rfc5988],  
[powder]), e.g.

200 OK
Link: <http://example.com/uri-documentation>; rel="describedby"

  * using a ‘describedby’ ([powder]) relationship within the RDF graph created 
by  interpreting the content of a 200 response, eg:

200 OK
Content-Type: text/turtle

PREFIX : <http://www.iana.org/assignments/relation/>
<http://example.com> :describedby <http://example.com/uri-documentation> .
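For the Link-header mechanism in that list, a client-side sketch of extracting the 'describedby' target (deliberately simplified parsing; a real implementation would follow RFC 5988's link-value grammar):

```python
# Sketch: pull 'describedby' target URIs out of an HTTP Link header value.
import re

def describedby_targets(link_header):
    """Collect the target URIs of links whose relation is 'describedby'."""
    targets = []
    for part in link_header.split(","):
        m = re.search(r"<([^>]+)>", part)  # the URI between angle brackets
        if m and re.search(r'rel="?describedby"?', part):
            targets.append(m.group(1))
    return targets

hdr = '<http://example.com/uri-documentation>; rel="describedby"'
# describedby_targets(hdr) yields the uri-documentation URI to fetch next.
```

The 303 Location case needs no parsing at all, and the in-graph case requires an RDF parser, which is why the Link header sits in the middle in terms of client effort.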


Seeking clarification,
Given some arbitrary thing and a description of that thing, let's say:
 <http://example.org/uri> is described by <http://example.org/uri-documentation>
Previously we could GET /uri and either:
a) follow the value of the Location header in a 303 response to get to 
/uri-documentation
b) follow the value of the Link header to get to /uri-documentation
With this proposal though

Re: httpRange-14 Change Proposal

2012-03-28 Thread Nathan

Nathan wrote:

Jeni Tennison wrote:

# Details

In section 4.1, in place of the second paragraph and following list, 
substitute:


  There are three ways to locate a URI documentation link in an HTTP 
response:


   * using the Location: response header of a 303 See Other response 
[httpbis-2],  e.g.


 303 See Other
  Location: <http://example.com/uri-documentation>

   * using a Link: response header with link relation 'describedby' 
([rfc5988],  [powder]), e.g.


 200 OK
 Link: <http://example.com/uri-documentation>; rel="describedby"

   * using a ‘describedby’ ([powder]) relationship within the RDF 
graph created by  interpreting the content of a 200 response, eg:


 200 OK
 Content-Type: text/turtle

 PREFIX : <http://www.iana.org/assignments/relation/>
 <http://example.com> :describedby <http://example.com/uri-documentation> .



Seeking clarification,

Given some arbitrary thing and a description of that thing, let's say:

  <http://example.org/uri> is described by 
<http://example.org/uri-documentation>


Previously we could GET /uri and either:

a) follow the value of the Location header in a 303 response to get to 
/uri-documentation


b) follow the value of the Link header to get to /uri-documentation

With this proposal though we'd be able to say issue a GET to /uri with 
an Accept header value of text/turtle, and the server could return back 
the contents of /uri-documentation, with a status of 200 OK, and where 
the text/turtle response contained:


PREFIX :<http://www.iana.org/assignments/relation/>
<http://example.com> :describedby <http://example.com/uri-documentation>


PREFIX :<http://www.iana.org/assignments/relation/>
<http://example.org/uri> :describedby <http://example.org/uri-documentation>

c+p error, apologies.


Is this correct?

TIA







Re: httpRange-14 Change Proposal

2012-03-28 Thread Nathan

Jeni Tennison wrote:

# Details

In section 4.1, in place of the second paragraph and following list, substitute:

  There are three ways to locate a URI documentation link in an HTTP response:

   * using the Location: response header of a 303 See Other response [httpbis-2], 
 e.g.


 303 See Other
 Location: <http://example.com/uri-documentation>

   * using a Link: response header with link relation 'describedby' ([rfc5988], 
 [powder]), e.g.


 200 OK
 Link: <http://example.com/uri-documentation>; rel="describedby"

   * using a ‘describedby’ ([powder]) relationship within the RDF graph created by 
 interpreting the content of a 200 response, eg:


 200 OK
 Content-Type: text/turtle

 PREFIX : <http://www.iana.org/assignments/relation/>
 <http://example.com> :describedby <http://example.com/uri-documentation> .



Seeking clarification,

Given some arbitrary thing and a description of that thing, let's say:

   <http://example.org/uri> is described by <http://example.org/uri-documentation>



Previously we could GET /uri and either:

a) follow the value of the Location header in a 303 response to get to 
/uri-documentation


b) follow the value of the Link header to get to /uri-documentation

With this proposal though we'd be able to say issue a GET to /uri with 
an Accept header value of text/turtle, and the server could return back 
the contents of /uri-documentation, with a status of 200 OK, and where 
the text/turtle response contained:


PREFIX : <http://www.iana.org/assignments/relation/>
<http://example.com> :describedby <http://example.com/uri-documentation>

Is this correct?

TIA




Re: Thought: 207 Description Follows

2012-03-28 Thread Nathan
minor note: Content-Location could be used to provide a URI for the 
descriptor document, thus conneg-compatible out of the box.
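A consuming client's handling of the proposed code might look like this sketch (status handling only, no I/O; the 207 here is the hypothetical "Description Follows" code under discussion, which as noted below would likely have to move to 209+ given WebDAV's existing 207):

```python
# Sketch: classify what a response body is, under the proposed status code.
# Any ordinary 2xx means "a representation of the thing itself"; the
# hypothetical 207 means "a description of the thing", with Content-Location
# (if present) naming the descriptor document itself.

def classify(status, content_location=None):
    if status == 207:
        return ("description", content_location)
    if 200 <= status < 300:
        return ("representation", None)
    return ("other", None)
```

Clients that care about the IR/NIR distinction branch on the status code alone; clients that don't can treat 207 like any other success, which matches the browser behaviour Kjetil reports below.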


Nathan wrote:

Hi,

I believe TimBL has suggested this previously with a 208, however both 
207 and 208 are already assigned or mentioned in various DAV 
communities, thus 209 or higher would have to be used I believe.


Personally, I like the idea a lot, and the usefulness for IoT is great 
too - any convergence between the semantic web and IoT, especially at 
HTTP and descriptor level, would be great.


Best,

Nathan

Kjetil Kjernsmo wrote:

Hi all!

I hope it is OK that I just burst in here without having followed the 
discussion. Admittedly, I haven't been terribly interested, I've 
always enjoyed the 303 dance, I wrote the code and it was easy, and 
the IR/NIR distinction has always served me well. However, I also see 
that it is a bit painful to have to follow a redirect, both for end 
users and for code, and the bit about cachability, CORS problems, etc 
makes it clear there is room for improvement.


So, how about a new HTTP response code, e.g. 207 Description Follows? 
I.e., it is like 200 OK, but makes it clear that what you're 
dereferencing is not an IR. Instead, you're getting a description of 
the thing.


This would have implications well beyond our community, GETting the 
URI of a device in the Internet of Things would  also reasonably 
return a 207. Without having thought too deeply about this, I suggest 
this means it satisfies the orthogonality of specifications constraint.


I just quickly hacked a server to test how browsers would react to a 
207 code, and all browsers I have handled it gracefully. I therefore 
conjecture that clients needing to know the IR/NIR distinction will be 
able to figure it out by looking at the status code only, those that 
need not, would not need to be bothered.  
Deployment costs should thus be very low. We're also working to get 
our code into Debian (older versions are already in Ubuntu), so if we 
have this settled before Debian Wheezy freezes in June, it would be 
available in mainstream hosting solutions late this year. I think 
that's a key, because many users control very little of their server 
setup, and custom code is "dangerous", but with the support of Debian 
the costs for hosters are marginal.

Naively Yours,

Kjetil











Re: Thought: 207 Description Follows

2012-03-28 Thread Nathan

Hi,

I believe TimBL has suggested this previously with a 208, however both 
207 and 208 are already assigned or mentioned in various DAV 
communities, thus 209 or higher would have to be used I believe.


Personally, I like the idea a lot, and the usefulness for IoT is great 
too - any convergence between the semantic web and IoT, especially at 
HTTP and descriptor level, would be great.


Best,

Nathan

Kjetil Kjernsmo wrote:

Hi all!

I hope it is OK that I just burst in here without having followed the 
discussion. Admittedly, I haven't been terribly interested, I've always 
enjoyed the 303 dance, I wrote the code and it was easy, and the IR/NIR 
distinction has always served me well. However, I also see that it is a bit 
painful to have to do follow a redirect, both for end users and for code, and 
the bit about cachability, CORS problems, etc makes it clear there is room for 
improvement.


So, how about a new HTTP response code, e.g. 207 Description Follows? I.e., it 
is like 200 OK, but makes it clear that what you're dereferencing is not an 
IR. Instead, you're getting a description of the thing.


This would have implications well beyond our community, GETting the URI of a 
device in the Internet of Things would  also reasonably return a 207. Without 
having thought too deeply about this, I suggest this means it satisfies the 
orthogonality of specifications constraint.


I just quickly hacked a server to test how browsers would react to a 207 code, 
and all browsers I have handled it gracefully. I therefore conjecture that clients 
needing to know the IR/NIR distinction will be able to figure it out by looking 
at the status code only, those that need not, would not need to be bothered. 
 
Deployment costs should thus be very low. We're also working to get our code 
into Debian (older versions are already in Ubuntu), so if we have this settled 
before Debian Wheezy freezes in June, it would be available in mainstream 
hosting solutions late this year. I think that's a key, because many users 
control very little of their server setup, and custom code is "dangerous", but 
with the support of Debian the costs for hosters are marginal. 


Naively Yours,

Kjetil








Re: Change Proposal 25 for HttpRange-14

2012-03-25 Thread Nathan
What's the difference between this and Content-Location (which I believe 
Ian suggested a year or two ago)?


Best, Nathan

Tim Berners-Lee wrote:

Jonathan,

I have written the below idea up as a change proposal 
http://www.w3.org/wiki/HTML/ChangeProposal25


The number "25" has no semantics.

Tim

On 2012-03 -25, at 12:35, Tim Berners-Lee wrote:


[...] the basic idea of giving a way for the server to make it
explicit that the URI identifies not the document but its subject, without the 
internet round-trip time of 303,
is a useful path to go down.

If Ian Davis and co would be happy with it, how about a header

200 OK
Document:  foo123476;doc=yes

which means "Actually the URI you gave is not the URI of this document,
but the URI of this document is foo123476.html (a relative URI)."

- This is the same as doing a 301 to foo123476.html and returning the same 
content.
- Non-data clients will ignore it, and just show users the page anyway.
- Saves the round trip time of 301
- Avoids having the same URI for the document and its subject.

This will dismantle HTTP range-14 a bit more, but still never give the same
URI to two things.  It would mean code changes to my client code and just a 
reconfig
change to Ian's server. 


Tim
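
A minimal sketch of what a data client would do on seeing Tim's proposed header. The `Document:` header is hypothetical (it does not exist in deployed HTTP), and the URIs are made up; the only real machinery is ordinary RFC 3986 relative resolution, which the WHATWG `URL` class performs:

```javascript
// Hypothetical: the request URI names the SUBJECT; the proposed
// "Document:" header carries a (possibly relative) reference to the
// document describing it. Resolve it against the request URI.
function resolveDocumentHeader(requestUri, headerValue) {
  return new URL(headerValue, requestUri).href;
}

resolveDocumentHeader("http://example.org/foo123476", "foo123476.html");
// -> "http://example.org/foo123476.html"
```

This captures the claimed equivalence with a 301: the client ends up with two distinct URIs (thing and document) without a second round trip.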













Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Nathan

Norman Gray wrote:

Nathan, hello.
On 2011 Oct 20, at 12:54, Nathan wrote:

Norman Gray wrote:

Ugh: 'IR' and 'NIR' are ugly obscurantist terms (though reasonable in their 
original context).  Wouldn't 'Bytes' and 'Thing', respectively, be better (says 
he, plaintively)?
Both are misleading, since NIR is the set of all things, and IR is a 
proper subset of NIR, it doesn't make much sense to label it "non 
information resource(s)" when it does indeed contain information 
resources. From that perspective "IR" and "R" makes somewhat more sense.


That's true, and clarifying.

Or, more formally, R is the set of all resources (?equivalent to "things named by a 
URI").  IR is a subset of that, defined as all the things which return 200 when you 
dereference them. NIR is then just R \ IR.


Indeed, I just wrote pretty much the same thing, but with a looser 
definition at [1], snipped here:


""
The only potential clarity I have on the issue, and why I've clipped 
above, is that I feel the /only/ property that distinguishes an "IR" 
from anything else in the universe, is that it has a 
[transfer/transport]-protocol as a property of it. In the case of HTTP 
this would be anything that has an HTTP Interface as a property of it.


If we say that anything with this property is a member of set X.

If an interaction with the thing named , using protocol 'p:', is 
successful, then  is a member of X.


An X of course, being what is currently called an "Information Resource".

Taking this approach would then position 303 as a clear opt-out built in 
to HTTP which allows a server to remain indifferent and merely point to 
some other X which may, or may not, give one more information as to what 
 refers to.

""

[1] http://lists.w3.org/Archives/Public/www-tag/2011Oct/0078.html

That's my understanding of things any way.


It's NIR that's of interest to this discussion, but there's no way of 
indicating within HTTP that a resource is in that set [1], only that something 
is in IR.


Correct, and I guess technically, and logically, HTTP can only ever have 
awareness of things which have an HTTP Interface as a property. So 
arguing for HTTP to cater for non HTTP things, seems a little illogical 
and I guess, impossible.



Back to your regularly scheduled argumentation...


Aye, as always, carry on!

Best,

Nathan



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-20 Thread Nathan

Norman Gray wrote:

Ugh: 'IR' and 'NIR' are ugly obscurantist terms (though reasonable in their 
original context).  Wouldn't 'Bytes' and 'Thing', respectively, be better (says 
he, plaintively)?


Both are misleading, since NIR is the set of all things, and IR is a 
proper subset of NIR, it doesn't make much sense to label it "non 
information resource(s)" when it does indeed contain information 
resources. From that perspective "IR" and "R" makes somewhat more sense.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-20 Thread Nathan

Kingsley Idehen wrote:

On 10/20/11 2:38 AM, Michael Smethurst wrote:

name = generic information resource urblah

You assign names to data objects. A data object is an encapsulation of 
data that can be simple or complex. In all cases data objects are 
accessible via addresses. In all cases, you access a data object via an 
act of dereference (irrespective of levels of indirection).


address = specific representation url (exposed in content location 
headers and rel="alternate" < forgot that bit earlier)


address (a URL) = how you get at the data, basically, data object access 
is the prime function. That isn't necessarily the case if you use a URL 
as a generic name i.e., one doesn't assume data object access, you can 
only assume data object identification.


The intuition challenge here is that URLs are being perceived as being 
indistinguishable from URIs at both the functional and conceptual 
levels. A URL being a kind of URI implies they are related but not 
identical. Thus, using URLs as data object names is quite *unintuitive* 
but extremely *ingenious*, especially in the context of the World Wide Web.


Trying to follow, can you confirm:

  URL = Absolute / non-frag URI?
  URI = URI-with-frag?

And you're suggesting that we do not name things, rather we name data 
objects (each data object "represents"/"describes" a thing), and since 
they are data objects we can name them with either a URL or a URI, the 
only distinction being that when naming with a URL (and no redirect) you 
can only provide a serialization of one data object in response to a 
lookup on an address (URL), whereas with a URI you can provide a 
serialization of several data objects (since URI contains a URL).


Or are you saying that a URL = an address and a URI = a name, and a 
single URI is not (or cannot be both) a name and an address?


Or, are you saying that we can use a single URL/URI as both a name and 
an address (which afaict, everybody already does).


Best,

Nathan



Re: Address Bar URI

2011-10-20 Thread Nathan

Michael Smethurst wrote:



On 20/10/2011 01:18, "Nathan"  wrote:


Dave Reynolds wrote:

The problem, as I see it, is that developers start from the NIR but then
use web browsers to find their way round the data and then cut paste the
browser locations they find, thus ending up with IRs where they should
have had NIRs. 

Agree, you put that very nicely Dave.

Perhaps Michael nailed it when he mentioned separation of concerns, one
could suggest that this is what happens when the data-tier has knowledge
of the presentation-tier (i.e. punting the user to a view of the data,
rather than the data directly). That itself is quite possibly the
product of using a web browser as a data browser.

I think it's fair to say that nothing is going to clean up the mess, so
perhaps it's just a case of looking at tooling to sanity check our data.

Hugh's javascript would make a fine bookmarklet, click it and it changes
the URI in the "address bar" to the NIR URI rather than the IR URI
(assuming a 1-1 relation that is).




Whilst I'm failing to lurk as well as:


is there room for:


to expose the nir uri? Maybe with a bookmarklet / greasemonkey style script
to pull out the nir uri and display it to anyone interested. Maybe even
using replaceState on the address bar :-)

Maybe this already exists



:) sounds like a plan (exposing it via a header) - a bookmarklet/greasemonkey 
script should be pretty easy to make to do this.
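
A sketch of the link-scanning half of such a bookmarklet, under assumed markup: it treats a `rel="alternate"` link with an RDF media type as the NIR/data URI. The link list is passed in as plain objects so the logic can be exercised outside a browser; in the page itself one would harvest `document.querySelectorAll("link")` and then call `history.replaceState`:

```javascript
// Assumed convention: publishers expose the data URI via
// <link rel="alternate" type="text/turtle" href="..."> (or rdf+xml).
function pickDataUri(links) {
  for (const l of links) {
    if (l.rel === "alternate" && /turtle|rdf/i.test(l.type || "")) return l.href;
  }
  return null; // page exposes no obvious data URI
}

pickDataUri([
  { rel: "stylesheet", type: "text/css", href: "/s.css" },
  { rel: "alternate", type: "text/turtle", href: "http://example.org/id/thing" },
]);
// -> "http://example.org/id/thing"
// In the browser: history.replaceState(null, "", dataUri);
```
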


best,


Further, surely it must be possible to create a tool which quickly
sanity checked triples, almost like a semantic web version of Google's
"did you mean?"

If you write:

  fbase:Italy owl:sameAs <http://dbpedia.org/page/Italy> .

Then any number of checks could be made, for example that the class of
Country is distinct from the class of Document, perhaps even hooking in
on the primaryTopic relation.
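
A toy version of that "did you mean?" check, as a sketch only: it relies on the DBpedia-specific heuristic that `/page/` URIs name the HTML document while `/resource/` URIs name the thing. Real tooling would dereference and inspect types or foaf:primaryTopic rather than pattern-match strings:

```javascript
// Heuristic sanity check for owl:sameAs objects (DBpedia-specific).
function didYouMean(objectUri) {
  const m = objectUri.match(/^http:\/\/dbpedia\.org\/page\/(.+)$/);
  return m ? "Did you mean http://dbpedia.org/resource/" + m[1] + "?" : null;
}

didYouMean("http://dbpedia.org/page/Italy");
// -> "Did you mean http://dbpedia.org/resource/Italy?"
```
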

It's clear after all these years that people will publish data however
they want, guidance will be ignored, and that humans make mistakes - so
perhaps we should be relying on machine understanding of our data, to
correct our very human mistakes. Wherever possible that is :)

Best,

Nathan











Re: Address Bar URI

2011-10-19 Thread Nathan

Dave Reynolds wrote:

The problem, as I see it, is that developers start from the NIR but then
use web browsers to find their way round the data and then cut paste the
browser locations they find, thus ending up with IRs where they should
have had NIRs. 


Agree, you put that very nicely Dave.

Perhaps Michael nailed it when he mentioned separation of concerns, one 
could suggest that this is what happens when the data-tier has knowledge 
of the presentation-tier (i.e. punting the user to a view of the data, 
rather than the data directly). That itself is quite possibly the 
product of using a web browser as a data browser.


I think it's fair to say that nothing is going to clean up the mess, so 
perhaps it's just a case of looking at tooling to sanity check our data.


Hugh's javascript would make a fine bookmarklet, click it and it changes 
the URI in the "address bar" to the NIR URI rather than the IR URI 
(assuming a 1-1 relation that is).


Further, surely it must be possible to create a tool which quickly 
sanity checked triples, almost like a semantic web version of Google's 
"did you mean?"


If you write:

 fbase:Italy owl:sameAs <http://dbpedia.org/page/Italy> .

Then any number of checks could be made, for example that the class of 
Country is distinct from the class of Document, perhaps even hooking in 
on the primaryTopic relation.


It's clear after all these years that people will publish data however 
they want, guidance will be ignored, and that humans make mistakes - so 
perhaps we should be relying on machine understanding of our data, to 
correct our very human mistakes. Wherever possible that is :)


Best,

Nathan



Re: Where to put the knowledge you add

2011-10-19 Thread Nathan

Hugh Glaser wrote:

On 20 Oct 2011, at 00:23, Nathan wrote:

Hugh Glaser wrote:

Hi.
I have argued for a long time that the linkage data (in particular owl:sameAs 
and similar links) should not usually be mixed with the knowledge being 
published.
Thus, for example as I discussed with Evan for the NYTimes site a while ago, it 
is not a good thing to put the owl:sameAs links (which were produced by a 
relatively unskilled individual over a short period of time) at the same status 
as the other data, which has been curated over decades by expert reporters.
These sameAs links have potentially very different trust,  provenance, licence, 
and possibly other non-functional attributes from the substantive data.
Clearly they have different trust and provenance, but licence may well be 
different, as the NYT may want people to take the triples away to bring traffic 
to their site, while keeping the other triples under more restricted licence.

seeAlso and put that information in to a different document, available upon 
request.

Sounds good.
But does not really address the problem.
Where do you put the seeAlso?
In the same Graph/store, with the same provenance/licence?
The particular predicate is not the issue - it is how you communicate the 
provenance/licence etc. of the knowledge, and if it gets the same 
provenance/licence as the data it is about through being put in the same place.


Whatever happens behind the interface is perhaps of no concern, do that 
however it can be done, on the public side of things simply split the 
data in to different documents, and apply provenance/license data on a 
per document basis. Thus allowing you to stick the perhaps untrustworthy 
sameAs assertions in one document, and the more trustworthy assertions 
in another, primary, document.


As for how exactly you communicate each document's provenance / license, 
well that's perhaps another topic, one not for me!


Best,

Nathan



Re: Where to put the knowledge you add

2011-10-19 Thread Nathan

Hugh Glaser wrote:

Hi.

I have argued for a long time that the linkage data (in particular owl:sameAs 
and similar links) should not usually be mixed with the knowledge being 
published.

Thus, for example as I discussed with Evan for the NYTimes site a while ago, it 
is not a good thing to put the owl:sameAs links (which were produced by a 
relatively unskilled individual over a short period of time) at the same status 
as the other data, which has been curated over decades by expert reporters.

These sameAs links have potentially very different trust,  provenance, licence, 
and possibly other non-functional attributes from the substantive data.
Clearly they have different trust and provenance, but licence may well be 
different, as the NYT may want people to take the triples away to bring traffic 
to their site, while keeping the other triples under more restricted licence.


seeAlso and put that information in to a different document, available 
upon request.




Re: Address Bar URI

2011-10-19 Thread Nathan

David Wood wrote:

On Oct 19, 2011, at 10:40, Kingsley Idehen wrote:
We don't believe in forcing issues on end-users by disrupting them via actions such as: implementing a Linked Data URI style for something like DBpedia that works modulo IE 6. 


I respect your position, but I do :)


Under what scenario would DBPedia have ever returned a 30x response with 
a frag-uri in the Location header?


If I understand the IE6 issue being discussed correctly, the bug only 
occurred when IE6 followed a 302 with a frag in the Location. And even 
if that were the case, how would that have any consequence to the end 
user, surely it's a slight inconvenience for the server only, which 
would have to absolute~ify the frag-URI first?




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-19 Thread Nathan

Leigh Dodds wrote:

On 19 October 2011 20:48, Kingsley Idehen  wrote:

On 10/19/11 3:16 PM, Leigh Dodds wrote:

RFC 3986:

"A Uniform Resource Identifier (URI) is a compact sequence of
characters that identifies an abstract or physical resource."

Yes, I agree with that.

2 URIs, therefore 2 resources.

I disagree with your interpretation though.


But I'm not interpreting anything there. The definition is a URI
identifies a resource. Ergo two different URIs identify two resources.


Nonsense, and I'm surprised to hear it.

Given two distinct URIs the most you can determine is that you have two 
distinct URIs.


You do not know how many resources are identified, there may be no 
resources, one, two, or full sets of resources.


Do see RFC3986, especially the section on equivalence.



Re: Browser Extension for setting HTTP headers

2011-07-31 Thread Nathan

Michael Hausenblas wrote:


Does anyone know a browser extension that will allow one to set the 
'Accept:' HTTP header and follow redirects (a la curl -L), but 
actually show what it's done (a la curl -i)?


Hopefully one that works in both Firefox and Chrome (a la Poster, but 
without this lack).


Why a browser extension? :)


I typically use http://redbot.org/ or http://hurl.it/ with a slight 
preference for the former ...


Or you can use XMLHttpRequest which allows setting the Accept header 
(CORS-beware!)
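
A sketch of that XMLHttpRequest route. The q-values below are an illustrative preference order, not anything normative; the Accept string is factored into a function so the value itself can be checked, with the browser-only XHR usage shown in comments:

```javascript
// An RDF-preferring Accept header (illustrative q-values).
function rdfAccept() {
  return "text/turtle, application/rdf+xml;q=0.9, text/html;q=0.5";
}

// Browser usage (CORS permitting):
//   const xhr = new XMLHttpRequest();
//   xhr.open("GET", "http://example.org/resource");
//   xhr.setRequestHeader("Accept", rdfAccept());
//   xhr.onload = () => console.log(xhr.status, xhr.getAllResponseHeaders());
//   xhr.send();
```

Note that, unlike `curl -i`, XHR only exposes the headers of the final response after redirects have been followed, so it answers Michael's "follow redirects" need but not the "show what it's done" part.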


Best,

Nathan



Re: Fwd: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Nathan

Danny Ayers wrote:

I feel very guilty being in threads like this. Shit fuck smarter people than
me.


Just minor, and I can hardly talk as I swear most often in different 
settings, but I am a little surprised to see this language around here. 
I quite like having an arena where these words don't arise in the 
general conversation.


Ack you know what I'm saying - nothing personal, but I'd personally 
appreciate not seeing them too frequently around here :)


Best!



Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Nathan

Nathan wrote:

Henry Story wrote:

On 19 Jun 2011, at 18:27, Giovanni Tummarello wrote:

but dont be surprised as  less and less people will be willing to 
listen as more and more applications (Eg.. all the stuff based  on 
schema.org) pop up never knowing there was this problem... (not in 
general. of course there is in general, but for their specific use 
cases)


The question is if schema.org makes the confusion, or if the schemas 
published there use a DocumentObject ontology where the distinctions 
are clear but the rule is that object relationships are in fact going 
via the primary topic of the document. I have not looked at the 
schema, but it seems that before arguing that they are inconsistent 
one should see if there is not a consistent interpretation of what 
they are doing.


Sorry, I'm missing something - from what I can see, each document has a 
number of items, potentially in a hierarchy, and each item is either 
anonymous, or has an @itemid.


Where's the confusion between Document and Primary Subject?


Or do you mean from the Schema.org side, where each Type and Property 
has a dereferenceable URI, which currently happens to also be used for 
the document describing the Type/Property?




Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Nathan

Henry Story wrote:

On 19 Jun 2011, at 18:27, Giovanni Tummarello wrote:


but dont be surprised as  less and less people will be willing to listen as 
more and more applications (Eg.. all the stuff based  on schema.org) pop up 
never knowing there was this problem... (not in general. of course there is in 
general, but for their specific use cases)


The question is if schema.org makes the confusion, or if the schemas published 
there use a DocumentObject ontology where the distinctions are clear but the 
rule is that object relationships are in fact going via the primary topic of 
the document. I have not looked at the schema, but it seems that before arguing 
that they are inconsistent one should see if there is not a consistent 
interpretation of what they are doing.


Sorry, I'm missing something - from what I can see, each document has a 
number of items, potentially in a hierarchy, and each item is either 
anonymous, or has an @itemid.


Where's the confusion between Document and Primary Subject?



Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-19 Thread Nathan

Exactly,

Things become even clearer when you add in a messenger.

A messenger carried a message about an erupting volcano, to conflate the 
message and the subject of the message is to say that a messenger 
carried an erupting volcano, which is nonsense.


We've long since known not to conflate the Messenger with the Message, 
this is why we don't shoot the messenger, however I think this is 
possibly the first time in history where we've questioned whether the 
message and the subject(s) of the message were different things or not.


Best,

Nathan

Pat Hayes wrote:

Really (sorry to keep raining on the parade, but) it is not as simple as this. 
Look, it is indeed easy to not bother distinguishing male from female dogs. One 
simply talks of dogs without mentioning gender, and there is a lot that can be 
said about dogs without getting into that second topic. But confusing web 
pages, or documents more generally, with the things the documents are about, 
now that does matter a lot more, simply because it is virtually impossible to 
say *anything* about documents-or-things without immediately being clear which 
of them - documents or things - one is talking about. And there is a good 
reason why this particular confusion is so destructive. Unlike the 
dogs-vs-bitches case, the difference between the document and its topic, the 
thing, is that one is ABOUT the other. This is not simply a matter of ignoring 
some potentially relevant information (the gender of the dog) because one is 
temporarily not concerned with it: it is two different ways of using the very names that are the fabric of the descriptive representations themselves. It confuses language with language use, confuses language with meta-language. It is like saying giraffe has seven letters rather than "giraffe" has seven letters. Maybe this does not break Web architecture, but it certainly breaks **semantic** architecture. It completely destroys any semantic coherence we might, in some perhaps impossibly optimistic vision of the future, manage to create within the semantic web. So yes indeed, the Web will go on happily confusing things with documents, partly because the Web really has no actual contact with things at all: it is entirely constructed from documents (in a wide sense). But the SEMANTIC Web will wither and die, or perhaps be still-born, if it cannot find some way to keep use and mention separate and coherent. So far, http-range-14 is the only viable suggestion I have seen for how to do this. If anyone has a better one, let us discuss it. But just 
blandly assuming that it will all come out in the wash is a bad idea. It won't. 


Pat

On Jun 18, 2011, at 1:51 PM, Danny Ayers wrote:


On 17 June 2011 02:46, David Booth  wrote:


I agree with TimBL that it is *good* to distinguish between web pages
and dogs -- and we should encourage folks to do so -- because doing so
*does* help applications that need this distinction.  But the failure to
make this distinction does *not* break the web architecture any more
than a failure to distinguish between male dogs and female dogs.

Thanks David, a nice summary of the most important point IMHO.

Ok, I've been trying to rationalize the case where there is a failure
to make the distinction, but that's very much secondary to the fact
that nothing really gets broken.

Cheers,
Danny.

http://danny.ayers.name





IHMC (850)434 8903 or (650)494 3973   
40 South Alcaniz St.   (850)202 4416   office

Pensacola(850)202 4440   fax
FL 32502  (850)291 0667   mobile
phayesAT-SIGNihmc.us   http://www.ihmc.us/users/phayes













Re: HTTP 302

2011-06-17 Thread Nathan

Alan Ruttenberg wrote:

On Fri, Jun 17, 2011 at 4:56 PM, Nathan  wrote:


Christopher Gutteridge wrote:


One last comment, it's a shame we use a code meaning "See Other"

You could get a lot of useful mileage out of a 3XX code meaning "Is
Described By"



and what if you got two of those 3XX's chained, what would be being
described?

-> GET /A
-< 30X /B
-> GET /B
-< 30X /C
-> GET /C
-< 200 OK

does /C describe /A or /B ?


/B (assuming 30X = 303)


Sorry I meant 30X to be a new status code meaning "Is Described By". 
That said, 303 doesn't mean that /C describes anything, it just 
indicates that the requested resource does not have a representation of 
its own that can be transferred by the server over HTTP.



Can you offer an interpretation otherwise?


Well, what if it describes /A, or something else entirely, or nothing at 
all? It seems like a tall ask for a server responding to one URI to say 
what another URI is (specify that another URI describes something) - 
perhaps the weakness of the "see other" statement is an architectural 
strength in the web.


Best,

Nathan



Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-17 Thread Nathan

Henry Story wrote:

On 17 Jun 2011, at 22:42, Nathan wrote:


You could use the same name for both if each name was always coupled to a 
universe, specified by the predicate, and you cut out type information from 
data, such that:

 :animalname "sasha" ; :created "2011" .

was read as:

Animal() :animalname "sasha" .
Document() :created "2011" .

the ability to do this could be pushed on to ontologies, with domain and range 
and restrictions specifying universes and boundaries - but it's a big change.


No its quite simple in fact, as I pointed out in a couple of e-mails in this 
thread. You just need to be careful when creating relations that certain 
relations are in fact inferred relations between primary topics.


I'd agree, but anything that involves being careful is pretty much 
doomed to failure on the web :p



really, different names for different things is quite simple to stick to,


yes, but there are a lot of people who say it is too complicated. I don't find 
it so, but perhaps it is for their use cases. I say that we describe the option 
they like, find out what limitations they will face, and document 
it. Then next time we can refer others to that discovery.

So limitations to look for would be limitations as to the complexity of the 
data created. The other limitation is that even on simple blog pages there are 
at least three or four things on the page.


there's also a primary limitation of the programming languages 
developers are using, if they've got locked in stone classes and 
objects, or even just structures, then the dynamics of RDF can be pretty 
hard to both understand mentally, and use practically.



and considering most (virtually all) documents on the web have several 
different elements and identifiable things,


indeed.


the one page one subject thing isn't worth spending too much time focusing on 
as a generic use case, as any solution based on it won't apply to the web at 
large which is very diverse and packed full of lots of potentially identifiable 
things.


agree. But it is one of those things that newbies feel the urge to do, and will 
keep on wanting to do. So perhaps for them one should have special simple 
ontologies or guides for how to build these ObjectDocument ontologies. In any 
case this seems to be the type of thing the microformats people were (are?) 
doing.


hmm.. microformats seems to be pretty focussed on describing multiple 
items on one page, however the singularity is present in that they 
focussed on being described using a single Class Blueprint style, one 
class, a predetermined set of properties belonging to the class, and a 
simple chained hierarchy - this stems from most OO based languages.


With a bit of trickery you can use RDF and OWL the same way, it just 
means you have different "views" over the data, where you can see 
Human(x) with a set of properties, or Male(x) with another set, or 
Administrator(x) with yet another set. This is less about the data 
published and more about how it's consumed, viewed, and processed though.


Quite sure something can be done with that, where the simple version of 
the data uses a basic schema.org like ontology, and advanced usage is 
more RDF like using multiple ontologies. The "views" thing would be a 
way to merge the two approaches..


Best,

Nathan



Re: HTTP 302

2011-06-17 Thread Nathan

Christopher Gutteridge wrote:

One last comment, it's a shame we use a code meaning "See Other"

You could get a lot of useful mileage out of a 3XX code meaning "Is 
Described By"




and what if you got two of those 3XX's chained, what would be being 
described?


-> GET /A
-< 30X /B
-> GET /B
-< 30X /C
-> GET /C
-< 200 OK

does /C describe /A or /B ?

303 is a nice loose way of saying "/x may give you more information", 
stressing the "may" in that sentence, as it equally may not.
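
Nathan's chained-30X question can be sketched directly. Assuming the hypothetical "Is Described By" code, only the *last* redirecting URI in the chain was ever asserted to be described by the final 200 body; the server for /A said nothing about /C:

```javascript
// chain: the ordered URIs visited, e.g. ["/A", "/B", "/C"],
// where each hop but the last was a hypothetical "Is Described By" 30X.
function describedBy(chain) {
  if (chain.length < 2) return { finalUri: chain[0] || null, directSubject: null };
  return {
    finalUri: chain[chain.length - 1],      // where the 200 body came from
    directSubject: chain[chain.length - 2], // the only URI it directly "describes"
  };
}

describedBy(["/A", "/B", "/C"]);
// -> { finalUri: "/C", directSubject: "/B" }
```

Whether "describes" then transits back to /A is exactly the semantic commitment 303's looser "see other" deliberately avoids making.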




Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-17 Thread Nathan

Alan Ruttenberg wrote:

Pat's knows something about the history of
what's known to work and what isn't. You ignore that history at the peril of
your ideas simply not working.


well said, although I think we could bracket yourself in that category 
too :)





Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]

2011-06-17 Thread Nathan

Danny Ayers wrote:

On 16 June 2011 02:26, Pat Hayes  wrote:


If you agree with Danny that a description can be a substitute for the thing it 
describes, then I am waiting to hear how one of you will re-write classical 
model theory to accommodate this classical use/mention error. You might want to 
start by reading Korzybski's 'General Semantics'.


IANAL, but I have heard of the use/mention thing, quite often. I don't
honestly know whether classical model theory needs a rewrite, but I'm
sure it doesn't on the basis of this thread. I also don't know enough
to know whether it's applicable - from your reaction, I suspect not.

As a publisher of information on the Web, I'm pretty much free to say
what I like (cf. Tim's Design Notes). Fish are bicycles. But that
isn't very useful.

But if I say Sasha is some kind of weird Collie-German Shepherd cross,
that has direct relevance to Sasha herself. More, the arcs in my
description between Sasha and her parents have direct correspondence
with the arcs between Sasha and her parents. There is information
common to the reality and the description (at least in human terms).
The description may, when you stand back, be very different in its
nature to the reality, but if you wish to make use of the information,
such common aspects are valuable. We've already established that HTTP
doesn't deal with any kind of "one true" representation. Data about
Sasha's parentage isn't Sasha, but it's closer than a non-committal
303 or rdfs:seeAlso. There's nothing around HTTP that says it can't be
given the same name, and it's a darn sight more useful than a
wave-over-there redirect or a random fish/bike association. I can't
see anything it breaks either.


You could use the same name for both if each name was always coupled to 
a universe, specified by the predicate, and you cut out type information 
from data, such that:


  :animalname "sasha" ; :created "2011" .

was read as:

 Animal() :animalname "sasha" .
 Document() :created "2011" .

the ability to do this could be pushed on to ontologies, with domain and 
range and restrictions specifying universes and boundaries - but it's a 
big change.
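
The "universe per predicate" reading above can be sketched with a toy domain map (assumed for illustration, not a real ontology): each predicate routes to the facet of the shared name it is actually about, which is the job the text suggests pushing onto ontology domains and ranges:

```javascript
// Toy predicate-to-universe map standing in for ontology domain declarations.
const DOMAINS = {
  ":animalname": "Animal",   // statements about the animal facet
  ":created": "Document",    // statements about the document facet
};

function facetOf(predicate) {
  return DOMAINS[predicate] || "Unknown";
}

facetOf(":animalname"); // -> "Animal"
facetOf(":created");    // -> "Document"
```
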


really, different names for different things is quite simple to stick 
to, and considering most (virtually all) documents on the web have 
several different elements and identifiable things, the one page one 
subject thing isn't worth spending too much time focusing on as a 
generic use case, as any solution based on it won't apply to the web at 
large which is very diverse and packed full of lots of potentially 
identifiable things.


best, nathan



Re: Schema.org considered helpful

2011-06-17 Thread Nathan

you should post to the lists more harry :)

Harry Halpin wrote:

I've been watching the community response to schema.org for the last
bit of time. Overall, I think we should clarify why people are upset.
First, there should be no reason to be upset that the major search
engines went off and created their own vocabularies. According to the
argument of decentralized extensibility, schema.org *exactly* what
Google/Yahoo!/Microsoft are supposed to be doing. It's a
straightfoward site that clearly for how the average Web developer can
use structured data in markup to solve real-world use-cases and
provides examples.  That's the entire vision of the Semantic Web, let
a thousand ontologies bloom with no central control.

The reason people are upset are that they didn't use RDFa, but instead
used microdata. One *cannot* argue that Google is ignoring open
standards. RDFa and microdata are *both* Last Call W3C Working Drafts
now. RDFa 1.0 is a spec but only for XHTML 1.0, which is not what most
of the Web uses. Microdata does have RDF parsing bugs, but again, most
developers outside the Semantic Web probably don't care - they want
JSON anyways.

From what I understand from events where the Rich Snippets team has
presented, RDFa is simply too complicated for ordinary web
developers to use. Google has been deploying Rich Snippets for two
years, claim to have user-studies and have experience with a large
user-base. This user-driven feedback should be taken on board by both
relevant WGs obviously, HTML and RDFa. Designing technology without
user-feedback leads to odd results (for proof, see many of the fun and
exciting "httpRange-14" discussions), which is also why many
practical developers do not use the technology.

But realistically, it's not the RDFa WG's job to do user-studies and
build compelling user-experiences in products. They are only a few
people. Why have the *hundreds* of people in the Semantic Web community
not done such work?

The fact of the matter is that the Semantic Web academic community has
had its priorities skewed in the wrong direction. Had folks been
spending time doing usability testing and focussing on user-feedback
on common problems (such as the rather obvious "vocabulary hosting"
problem) rather than focussing on things with little to no support
with the world outside academia, then we probably would not be in the
situation we are in today. Today, major companies such as Microsoft
(oData) and Google (microdata) are jumping on the "open data"
bandwagon but finding the RDF stack unacceptable. Some of it may be a
"not invented here" syndrome, but as anyone who has actually looked at
RDF/XML can tell you, some of it is hard-to-deny technical reasoning
by companies that have decided that "open data" is a great market but
do not agree with the technical choices made by the Semantic Web
stack.

This is not to say good things can't come out of the academic
community - the *internet* came out of the academic community. But
seriously, at some point (think of the role of Netscape in getting the
Web going with the magic of images) commercial companies enter the
game. We should be happy now search engines are seeing value in
structured data on the Web.

I would suggest the Semantic Web community take on-board the
"microdata" challenge in two different ways. First of all, start
focussing on user-studies and user experience (not just visual
interfaces, the Semantic Web has more than its share of user-hostile
visual interfaces). It's harder to publish academic papers on these
topics but possible (see SIGCHI), and would help a lot with actual
deployment. Second, we should start focussing more on actual empirical
data-driven feedback, both on what parts of RDF are being used and
common mistakes. With indexes such as the Billion Triple Challenge and
Sindice's index, we can actually do that with the Semantic Web. Third,
why not actually try to get RDF - or "open data more broadly" into the
browser in usable manner? Tabulator may be a step in the right
direction, but the user experience needs work. Fourth, why not start a
company and try to deliver products to actual end-users and give that
feedback to the wider community and W3C WGs (and if you already work
for an actual SemWeb company, please send your feedback from user
studies to the WG before Last Call)? I believe the Semantic Web
research community - which still has tons of funding and lots of
passion - can make the Web better.

Schema.org is not a threat. It's an opportunity to step up. Good luck everyone!

   cheers,
  harry

P.S.: Note these opinions are purely personal and held as an individual.








Re: Squaring the HTTP-range-14 circle

2011-06-17 Thread Nathan

could also term it "constrained vs diverse" :)

David Wood wrote:

Hi all,

This thread seems to me to be a classic "neat vs. scruffy" argument [1].  I used 
to be a neat, when I was young, foolish and of course selfish.  Now that I am old enough 
to see others' points of view, I have become scruffy.  Either that, or I'm just tired of 
trying to force others to do things my way.

The Web is a scruffy place and that is a feature, not a bug.

Regards,
Dave

[1] http://en.wikipedia.org/wiki/Neats_vs._scruffies


On Jun 17, 2011, at 10:27, Kingsley Idehen wrote:


On 6/17/11 2:55 PM, Ian Davis wrote:

BUT when they click a "Like" button on a blog they are expressing that they like the

 blog, not the movie it is about.

 AND when they click "like" on a facebook comment they are
 saying they like the comment not the thing it is commenting on.

 And on Amazon people say "I found this review useful" to
 like the review on the product being reviewed, separately from
 rating the product.
 So there is a lot of use out there which involves people expressing
 stuff in general about the message not its subject.

As an additional point, a review _is_ a separate thing, it's not a web
page. It is often contained within a webpage. It seems you are
conflating the two here. Reviews and comments can be and often are
syndicated across multiple sites so clearly any "liking" of the review
needs to flow with it.

Yes, it is a separate thing representable as a Data Object. Now the obvious 
question: what is a Web Page? Isn't it sourced from Data at an Address 
that's streamed to a client that uses a specific data presentation metaphor as 
basis for user comprehension?

Are the following identical or different, re. URI functionality ?

1. http://dbpedia.org/resource/Linked_Data
2. http://dbpedia.org/page/Linked_Data
3. http://dbpedia.org/data/Linked_Data.json

I may want to bookmark: http://dbpedia.org/page/Linked_Data, I may also be 
interested in its evolution over time via services like Memento [1].

The thing is that re. WWW we have an Information Space dimension and associated 
patterns that preceded the Data Space dimension and its emerging patterns, which 
we (this community) are collectively trying to crystallize, in an unobtrusive 
manner.

Links:

1. http://www.mementoweb.org/guide/quick-intro/

--

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
















Re: Squaring the HTTP-range-14 circle

2011-06-17 Thread Nathan

Kingsley Idehen wrote:

On 6/17/11 3:11 PM, Leigh Dodds wrote:

I just had to go and check whether Amazon reviews and Facebook
comments actually do have their own pages. That's because I've never
seen them presented as anything other than objects within another
container, either in a web page or a mobile app. So I think you could
argue that when people are "linking" and marking things as useful,
they're doing that on a more general abstraction, i.e. the "Work" (to
borrow FRBR terminology) not the particular web page.


You have to apply context to your statement above. Is the context: WWW 
as an Information space or Data Space? These contexts can co-exist, but 
we need to allow users to context-switch, unobtrusively. Thus, they have 
to co-exist, and that's why we have to leverage what the full URI 
abstraction delivers. As stated earlier, it doesn't mean others will 
follow or understand immediately, you need more than architecture for 
that;  hence the need for a broad spectrum of solutions that do things 
properly.




and UX challenges, indeed if the UX was addressed first for the 
functionality, then whatever was implemented could be webized and 
standardized - could be a good way to force innovation in this area.




Re: Squaring the HTTP-range-14 circle

2011-06-17 Thread Nathan

Ian Davis wrote:

As an additional point, a review _is_ a seperate thing, it's not a web
page. It is often contained within a webpage. It seems you are
conflating the two here. Reviews and comments can be and often are
syndicated across multiple sites so clearly any "liking" of the review
needs to flow with it.


so the "like" data needs to be webized and exposed easily.

also, realtime updates on / streams of such data would come in very 
useful. permissions and visibility would need looking at though, so probably 
authentication via webid or other would be needed too.




Re: Squaring the HTTP-range-14 circle

2011-06-17 Thread Nathan

Tim Berners-Lee wrote:
And on Amazon people say "I found this review useful" to 
like the review on the product being reviewed, separately from

rating the product.
So there is a lot of use out there which involves people expressing 
stuff in general about the message not its subject.


yes, common use case, many sites give karma to comments / reviews and 
have links to them both in and out of context.


When the cost is just fixing Microdata syntax to make it easy to 
say things about the subject of a page.


far from expert on microdata, but @itemid may well cater for this.




a reply to many posts

2011-06-17 Thread Nathan
mmunity, may not 
happen, but there's a risk of it, and it does appear that there's a 
strong long term message of "rdf is too complicated, we like to do 
things like this [x]", the same message can be found in microdata, and 
ignoring it could be risky.


apologies for being a bit quiet of late, and for the scruffiness of this 
mail - not been too well, trying to slowly get back in to things at the 
minute - hope to be back on form soon.


best, nathan



Re: Labels separate from localnames (Was: Best Practice for Renaming OWL Vocabulary Elements)

2011-04-22 Thread Nathan

Kingsley Idehen wrote:

On 4/22/11 7:36 AM, Martin Hepp wrote:

See replies inline ;-)
Sorry to say this, but I think you are making a mistake. To say that 
the rdfs:label has to look like a variable name because it is for Web 
developers sounds to me like you are saying that the javadoc of a 
method should look like a piece of code because it is addressed to 
programmers. I refuse to believe that Web developers understand 
pseudo code better than natural language.
I will finally give in to use English spacing and capitalization for 
rdfs:labels in GoodRelations, e.g. use


"Business entity"@en for gr:BusinessEntity etc.

But I will keep the cardinality recommendation in the rdfs:label of 
properties, e.g.


 serial number (0..*) for gr:serialNumber


Why not move that to rdfs:comment? 


+1 seems more like a comment or a description from where I'm standing 
too, rather than a label.




Re: Linked Data, Blank Nodes and Graph Names

2011-04-09 Thread Nathan

Nathan wrote:

Pat Hayes wrote:

On Apr 9, 2011, at 4:05 PM, Nathan wrote:


Michael Brunnbauer wrote:
I would prefer a way of skolemizing that does not depend on the 
graph name

and can be done by producer *and* consumer of RDF on a voluntary base.
It should be a standard with reference implementations in all important
languages for:
-generating a skolem URI
-converting an unskolemized RDF serialization to a skolemized one
-converting a skolemized RDF serialization to an unskolemized one
It is important that skolem URIs would be recognizable.

I agree, why a URI?


Because the only point of this entire thread and discussion is to make 
RDF more regular, by replacing bnodes with URIs, so that all names in 
all triples are URIs or literals. Thus, conforming RDF will be 
simplified from having three kinds of node to two (URIs and literals). 
If we introduce something other than a URI, we will have gone from 
three to four kinds of node, which does not strike me as a 
simplification. 


"It is important that skolem URIs would be recognizeable.", what would 
the purpose of them being recognizable, if there were only literals and 
URIs?


(I'm taking you to be talking about losing ∃ from RDF, and others to be 
trying to find a way to keep the ability to say something, and changing 
that to "something, let's call it X, that has ..")


As in the requirements section here:
  http://www.w3.org/wiki/BnodeSkolemization




Re: Linked Data, Blank Nodes and Graph Names

2011-04-09 Thread Nathan

Pat Hayes wrote:

On Apr 9, 2011, at 4:05 PM, Nathan wrote:


Michael Brunnbauer wrote:

I would prefer a way of skolemizing that does not depend on the graph name
and can be done by producer *and* consumer of RDF on a voluntary base.
It should be a standard with reference implementations in all important
languages for:
-generating a skolem URI
-converting an unskolemized RDF serialization to a skolemized one
-converting a skolemized RDF serialization to an unskolemized one
It is important that skolem URIs would be recognizable.

I agree, why a URI?


Because the only point of this entire thread and discussion is to make RDF more regular, by replacing bnodes with URIs, so that all names in all triples are URIs or literals. Thus, conforming RDF will be simplified from having three kinds of node to two (URIs and literals). If we introduce something other than a URI, we will have gone from three to four kinds of node, which does not strike me as a simplification. 


"It is important that skolem URIs would be recognizeable.", what would 
the purpose of them being recognizable, if there were only literals and 
URIs?


(I'm taking you to be talking about losing ∃ from RDF, and others to be 
trying to find a way to keep the ability to say something, and changing 
that to "something, let's call it X, that has ..")


Best,

Nathan



Re: Linked Data, Blank Nodes and Graph Names

2011-04-09 Thread Nathan

Michael Brunnbauer wrote:

I would prefer a way of skolemizing that does not depend on the graph name
and can be done by producer *and* consumer of RDF on a voluntary base.
It should be a standard with reference implementations in all important
languages for:

-generating a skolem URI
-converting an unskolemized RDF serialization to a skolemized one
-converting a skolemized RDF serialization to an unskolemized one

It is important that skolem URIs would be recognizable.


I agree, why a URI? and why invent a new identifier (possibly getting 
all kinds of duplicates over time and creating a management nightmare) 
when often people/machines have already given nodes a form of 
reference/identifier?


Why turn _:b1 in to some new uri, when often you could just say that 
_:b1 is short for http://example.org/doc#[_:b1] ? (which isn't a valid 
IRI but does have all the beneficial properties of an IRI)
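
As an illustration of the alternative sketched here, a hypothetical deterministic scheme could derive a recognizable skolem IRI from the graph name plus the existing bnode label. (The `/.well-known/genid/` path segment and the hashing are my assumptions, loosely echoing later RDF 1.1 practice, not anything specified in this thread.)

```python
import hashlib

# Assumption: a reserved path segment makes skolem IRIs recognizable.
GENID = "/.well-known/genid/"

def skolemize(graph_name, bnode_label):
    # Deterministic: the same (graph, label) pair always yields the same IRI,
    # so repeated conversions never mint duplicate identifiers.
    digest = hashlib.sha256(
        f"{graph_name}|{bnode_label}".encode()).hexdigest()[:16]
    return f"{graph_name}{GENID}{digest}"

def is_skolem(iri):
    # Recognizable by construction, so a consumer can round-trip
    # a skolemized serialization back to blank nodes if it wants.
    return GENID in iri
```

The graph name scopes the identifier, so `_:b1` in two different graphs yields two different IRIs, while `_:b1` in the same graph always yields the same one.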


Best,

Nathan



Re: Linked Data, Blank Nodes and Graph Names

2011-04-07 Thread Nathan

Kingsley Idehen wrote:

On 4/7/11 1:45 PM, Nathan wrote:

[snips]
4) create a subset of RDF which does have a way of differentiating 
blank nodes from URI-References, where each blank node is named 
persistently as something like ( graph-name , _:b1 ), which would 
allow the subset to be effectively "ground" so that all the benefits 
of stable names and set operations are maintained for data management, 
but where also it can be converted (one way) to full RDF by removing 
those persistent names.


Generally, this thread perhaps differs from others, by suggesting that 
rather than changing RDF, we could layer on a set of specs which cater 
for all linked data needs, and allow that linked data to be considered 
as full RDF (with existential) when needed.


It appears to me, that if most people would be prepared to make the 
trade-off of losing the [ ] syntax and anonymous objects such that 
you always had a usable name for each thing, and were prepared to 
modify and upgrade tooling to be able to use this 
not-quite-rdf-but-rdf-compatible thing, then we could solve many real 
problems here, without changing RDF itself.


That said, it's a trade-off, hence, do the benefits outweigh the cost 
for you?


Maybe it boils down to making what Blank Nodes are a lot clearer. In the 
real-world I can assert 'existence' of something that possesses a 
collection of characteristics without having to specifically Name the 
Subject of my observations. I think capturing the context of the 
assertions ultimately alleviates the pain.


I agree, however this would effectively amount to still being able to 
use "_:b1" in syntax, and that amounting to saying "something exists, 
let us call it _:b1 within G, that has..", the benefit of that being 
that you're still explicitly saying 'something' rather than 'this thing 
called ', but also giving it a persistent name to use /within/ a 
named g-box (and g-snaps thereof).


Filtering through to the real world, this means that somebody can select 
_:b from  twice and get the same results each time, or quickly 
establish that the triples contained in one graph of subrdf are equal to 
those in another graph of subrdf, and so forth.


One has to wonder, if [] had never been in turtle, would it have made 
any difference to rdf uptake.. (_: syntax doesn't count, as that still 
involves giving something a name of sorts).


Best,

Nathan



Linked Data, Blank Nodes and Graph Names

2011-04-07 Thread Nathan

Hi All,

To cut a long story short, blank nodes are a bit of a PITA to work with, 
they make data management more complex, new comers don't "get" them 
(lest presented as anonymous objects), and they make graph operations 
much more complex than they need be, because although a graph is a set 
of triples, you can't (easily) do basic set operations on non-ground 
graphs, which ultimately filters down to making things such as graph 
diff, signing, equality testing, checking if one graph is a super/sub 
set of another very difficult. Safe to say then, on one side of things 
Linked Data / RDF would be a whole lot simpler without those blank nodes.
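
A tiny illustration of the set-operations point, with Python tuples standing in for triples (vocabulary invented): two graphs that plainly say the same thing fail set equality when blank node labels differ, while ground graphs support equality, diff, and subset tests directly.

```python
# Non-ground graphs: isomorphic, but plain set equality fails on bnode labels.
g1 = {("_:a", "ex:name", '"sasha"')}
g2 = {("_:b", "ex:name", '"sasha"')}
print(g1 == g2)  # False: same meaning, different labels

# Ground graphs: every basic set operation just works.
h1 = {("ex:doc#it", "ex:name", '"sasha"'), ("ex:doc#it", "ex:age", '"3"')}
h2 = {("ex:doc#it", "ex:name", '"sasha"')}
print(h2 <= h1)  # True: subgraph test is plain subset test
print(h1 - h2)   # diff: the ex:age triple only
```

Checking isomorphism between g1 and g2 instead requires searching for a bnode mapping, which is exactly the complexity the post describes.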


It's probably worth asking then, in a Linked Data + RDF environment:

- would you be happy to give up blank nodes?

- just the [] syntax?

- do you always have a "name" for your graphs? (for instance when 
published on the web, the URL you GET, and when in a store, the ?G of 
the quad?)


I'm asking because there are multiple things that could be done:

1) change nothing

2) remove blank nodes from RDF

3) create a subset of RDF which doesn't have blank nodes and only deals 
with ground graphs


4) create a subset of RDF which does have a way of differentiating blank 
nodes from URI-References, where each blank node is named persistently 
as something like ( graph-name , _:b1 ), which would allow the subset to 
be effectively "ground" so that all the benefits of stable names and set 
operations are maintained for data management, but where also it can be 
converted (one way) to full RDF by removing those persistent names.


Generally, this thread perhaps differs from others, by suggesting that 
rather than changing RDF, we could layer on a set of specs which cater 
for all linked data needs, and allow that linked data to be considered 
as full RDF (with existential) when needed.


It appears to me, that if most people would be prepared to make the 
trade-off of losing the [ ] syntax and anonymous objects such that you 
always had a usable name for each thing, and were prepared to modify and 
upgrade tooling to be able to use this not-quite-rdf-but-rdf-compatible 
thing, then we could solve many real problems here, without changing RDF 
itself.


That said, it's a trade-off, hence, do the benefits outweigh the cost 
for you?


Best,

Nathan



Re: LOD Cloud Cache Stats

2011-04-04 Thread Nathan

Kingsley Idehen wrote:

On 4/3/11 11:41 PM, Nathan wrote:

Hi Kingsley, All,

Incoming open request, could anybody provide similar statistics for 
the usage of each datatype in the wild (e.g. the xsd types, xmlliteral 
and rdf plain literal)?


Ideally Kingsley, could you provide a breakdown from the lod cloud 
cache? would be very very useful to know.


Best & TIA,

Nathan

Kingsley Idehen wrote:
I've knocked up a Google spreadsheet that contains stats about our 21 
Billion Triples+ LOD cloud cache.

...
https://spreadsheets.google.com/ccc?key=0AihbIyhlsQSxdHViMFdIYWZxWE85enNkRHJwZXV4cXc&hl=en 
-- LOD Cloud Cache SPARQL stats queries and results




Nathan,

The typed literals used in > 10k triples:

count      datatype IRI
11308      xsd:anyURI
12553      http://dbpedia.org/datatype/day
12788      http://dbpedia.org/ontology/day
15875      http://dbpedia.org/ontology/usDollar
18228      http://dbpedia.org/datatype/usDollar
20828      http://europeanaconnect.eu/voc/fondazione/sgti#fondazioneNot
22934      http://statistics.data.gov.uk/def/administrative-geography/StandardCode
23368      http://www.w3.org/2001/XMLSchema#date
30695      http://dbpedia.org/datatype/inhabitantsPerSquareKilometre
31662      http://dbpedia.org/datatype/second
35506      http://dbpedia.org/datatype/kilometre
57409      http://www.w3.org/2001/XMLSchema#int
160117     http://stitch.cs.vu.nl/vocabularies/rameau/RecordNumber
632256     http://www.w3.org/2001/XMLSchema#anyURI
1175435    xsd:string
1696035    http://data.ordnancesurvey.co.uk/ontology/postcode/Postcode
70194534   http://www.openlinksw.com/schemas/virtrdf#Geometry
120147725  http://www.w3.org/2001/XMLSchema#string

Spreadsheet will be updated too.



Thanks Kingsley, very much appreciated! :)

I have to admit I'm surprised by the lack of xsd:double and xsd:decimal 
in the two stats sets, and also the inclusion of some datatypes I'd 
never even heard of!


Are there any Virtuoso-specific nuances which do some conversion, or are 
all of these as found in the serialized RDF?


also is xsd:string automatically set for all plain literals (with / 
without langs?)


Cheers,

Nathan



Re: LOD Cloud Cache Stats

2011-04-03 Thread Nathan

Hi Kingsley, All,

Incoming open request, could anybody provide similar statistics for the 
usage of each datatype in the wild (e.g. the xsd types, xmlliteral and 
rdf plain literal)?


Ideally Kingsley, could you provide a breakdown from the lod cloud 
cache? would be very very useful to know.


Best & TIA,

Nathan

Kingsley Idehen wrote:
I've knocked up a Google spreadsheet that contains stats about our 21 
Billion Triples+ LOD cloud cache.

...
https://spreadsheets.google.com/ccc?key=0AihbIyhlsQSxdHViMFdIYWZxWE85enNkRHJwZXV4cXc&hl=en 
-- LOD Cloud Cache SPARQL stats queries and results




Re: Several questions about Linked Data

2011-03-23 Thread Nathan

Richard Light wrote:
I took Zhou's question to be about the need for a separate mechanism for 
connecting data sets, above and beyond the use of the same URI to 
represent the same concept.


on that note, has much been done recently on backlinks and link servers? 
(other than sameas.org)


Best,

Nathan



Re: Quick reality check please

2011-03-20 Thread Nathan

select distinct ?s where {?s ?p "Arts and Humanities Research Council"@en}

:)

Hugh Glaser wrote:

dbpedia sparql endpoint is not doing what I expect.

select distinct ?s where {?s ?p "World Wide Web Consortium"}
gives an answer
select distinct ?s where {?s ?p "Arts and Humanities Research Council"}
doesn't.

But both
http://dbpedia.org/resource/World_Wide_Web_Consortium
and
http://dbpedia.org/resource/Arts_and_Humanities_Research_Council
are there with dbpprop:name and rdfs:label and the string.

What am I doing wrong?

The two queries as the actual URLs:

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+distinct+%3Fs+where+%7B%3Fs+%3Fp+%22World+Wide+Web+Consortium%22%7D&debug=on&timeout=&format=text%2Fhtml&save=display&fname=

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+distinct+%3Fs+where+%7B%3Fs+%3Fp+%22Arts+and+Humanities+Research+Council%22%7D&debug=on&timeout=&format=text%2Fhtml&save=display&fname=

Hope this is not a senior moment, but suspect it is :-)







issues naming/deploying data

2011-03-02 Thread Nathan

Hi All,

There are certain practical issues relating to how you name things, 
httpRange-14 and the well covered 303/# ground.


I'm wondering, are there any more problems like this people have 
experienced, for example how do you follow your nose to RDF about jpegs.


If anybody has anything similar, and new or not covered well, can they 
let me know.


Cheers,

Nathan



Re: ANN: http://www.productontology.org - more than 300,000 specific OWL DL classes for types of objects

2011-03-02 Thread Nathan

s/Nathan/Toby ? ;)

Martin Hepp wrote:
Hi Nathan, 


[] a gr:ActualProductOrServiceInstance ,
   <http://www.productontology.org/id/London> ;
  gr:description "Condition: used." .


see http://www.productontology.org/#faq5 and 
http://www.productontology.org/#faq6

"Q: Why is everything a gr:ProductOrService? Isn't this wrong and dangerous?

The semantics of gr:ProductOrService is basically that of a tangible or 
intangible object on which rights can be granted or transferred, so even if social 
conventions tell us that rain, love, health, longevity, or sex should not be 
traded, they are not necessarily invalid as subclasses of gr:ProductOrService, 
because in some environments, it may be perfectly valid to sell rain or seek 
health by means of RDF and GoodRelations.

Q. Your idea sucks: I can even get a class definition for those Wikipedia 
lemmata that make absolutely no sense as a class.

First, this is not a question but a statement. Second, yes, you are absolutely 
right: You can request a class definition for John F. Kennedy or Massachusetts 
in 2010. However, there is absolutely no harm in providing a nonsense class 
definition, unless someone uses this to annotate an object.

The only classes that we filter out are those for Wikipedia disambiguation 
pages, since they are mostly irrelevant as classes.

Our approach is grounded in the idea of Human Computation: Instead of identifying 
valid lemmata beforehand, we rather watch which identifiers will be used in 
real-world data. Again, meaningless class definitions do not harm; meaningless data 
may."

Best
Martin









Re: ANN: http://www.productontology.org - more than 300,000 specific OWL DL classes for types of objects

2011-03-01 Thread Nathan

Martin,

Very nice and most useful! (300,000!) great work :)

Best,

Nathan

Martin Hepp wrote:

Dear all:

We are happy to release http://www.productontology.org, an online-service
that provides valid OWL DL class definitions for all of the ca. 300,000 types
of products or services that are contained in the 3.5 million Wikipedia entries.

In short, www.productontology.org provides for the schema level what DBpedia 
provides for the data / instance level of the Semantic Web.


A few examples:

   Laser_printer   http://www.productontology.org/id/Laser_printer
   Manure_spreader http://www.productontology.org/id/Manure_spreader
   Racing_bicycle  http://www.productontology.org/id/Racing_bicycle
   Soldering_iron  http://www.productontology.org/id/Soldering_iron
   Sweet_potato    http://www.productontology.org/id/Sweet_potato

The Product Ontology is designed to be compatible with the GoodRelations 
Ontology for e-commerce, but it can be used for any other Semantic Web or

Linked Data purpose that requires class definitions for objects.

All Wikipedia translations are preserved.

Background information and FAQs: 
  http://www.productontology.org/#faq


Examples in RDF/XML, Turtle, and RDFa:
  http://www.productontology.org/#examples

Any feedback is highly appreciated!

Acknowledgments: Thanks to Axel Polleres, Andreas Radinger, Alex Stolz, and
Giovanni Tummarello for very valuable feedback. 


The work on The Product Types Ontology has been supported by the German Federal
Ministry of Research (BMBF) by a grant under the KMU Innovativ program as part
of the Intelligent Match project (FKZ 01IS10022B).


martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  h...@ebusiness-unibw.org
phone:   +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
http://www.heppnetz.de/ (personal)
skype:   mfhepp 
twitter: mfhepp











Re: The truth about SPARQL Endpoint availability

2011-03-01 Thread Nathan

Richard Cyganiak wrote:

On 1 Mar 2011, at 00:14, Pierre-Yves Vandenbussche wrote:

Average 24h and 7days availability are just extra info. Colour associated to an 
endpoint is given based on its availability right now. That means:

-a red endpoint is NOT available right now.
-an orange endpoint is available but had some troubles since last 24h.
-a green endpoint is available without any trouble since last 24h.


Ok. But my point stands: There's a big difference between “down this very 
moment” (red) and “has never been up in the last 7 days”. The former might be 
back in 15 minutes. The latter most likely is gone permanently. I think it 
would be good to capture that in the choice of color.


good/fair point, I'd suggest:

-a red endpoint has not been available for X hours/days.
-an orange endpoint is available but had some troubles since last 24h.
-a green endpoint is available without any trouble since last 24h.

even better if X was a "last seen X hours/days ago"

cheers,

nathan



Re: Google's structured seach talk / Google squared UI

2011-02-11 Thread Nathan
All very nice, might be worth mentioning Michael Hausenblas' fine (WIP) 
addrable here too:


  https://github.com/mhausenblas/addrable

Best,

Nathan

Ian Davis wrote:

Hi,

I did something very similar to Google Squared in small php script a
couple of years ago:

http://iandavis.com/2009/lodgrid/?store=space&query=jupiter&columns=6

It uses linked data held in the Talis Platform and the platform's full
text search service.

More examples linked from the main page:

http://iandavis.com/2009/lodgrid/

Ian

On Fri, Feb 11, 2011 at 10:23 AM, Daniel O'Connor
 wrote:

Hi all,
This talk might have been seen by some of you; but was certainly new to me:
http://www.youtube.com/watch?v=5lCSDOuqv1A&feature=autoshare
Much of this is an exploration of how google is making use of freebase's
underlying linked data to better understand what they are crawling -
deriving what something is by examining its attributes; and automatically
creating something like linked data from it.

Additionally; it talks about Google squared - this tool appears to be
heavily powered by freebase data; as well as derived data from the web. I
was fairly impressed by the mix of understanding a user query and rendering
results as actual entities (one of the few non-facet based UIs I have seen).
For instance: "territorial authorities in new zealand"
http://www.google.com/squared/search?q=territorial+authorities+in+new+zealand
Whilst this is not using the typical linked data technology stack of RDF,
SPARQL, open licenced data, etc; it certainly shows you what can be done
with data in a graph structure; plus a UI which is a cross between a
spreadsheet and a search result.












Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-21 Thread Nathan

Harry Halpin wrote:

On Thu, Jan 20, 2011 at 11:15 AM, Nathan  wrote:

Out of interest, where is that process defined? I was looking for it the
other day - for instance in the quoted specification we have the example:

<edi:price xmlns:edi='http://ecommerce.example.org/schema' units='Euro'>32.18</edi:price>

Where's the bit of the XML specification which says you join them up by
concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and
'Euro' to get 'http://ecommerce.example.org/schema#Euro'?



Actually you don't. A namespace is just that - a tuple (namespace,
localname) in XML. That's why namespaces in XML are for all intents
and purposes broken and why, to a large extent, Web browser developers
in HTML stopped using them and hate implementing them in the DOM, and
so refuse to have them in HTML5. And that's one reason RDF(A) will
probably continue getting a sort of bad rap in the HTML world, as
prefixes are not associated with just making URIs, but with this
terrible namespace tuple.

For an archeology of the relevant standards, check out Section "What
Namespaces Do" of this paper. While the paper is focussed on why
namespace documents are a mess, the relevant information is in that
section and extensively referenced, with examples:

http://xml.coverpages.org/HHalpinXMLVS-Extreme.html


Ahh, thanks for explaining that one Harry, most helpful :)

Best,

Nathan



Re: Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Nathan wrote:

Dave Reynolds wrote:
All this presupposes some work to formalize and specify linked data. 
Is there anything like that planned?  In some ways Linked Data is an 
engineering experiment and benefits from that freedom to experiment. 
On the other hand interoperability eventually needs clear specifications.


Unsure, but I'll also ask the question, is there anything planned? I'd 
certainly +1 standardization and do anything I could to help the process 
along.


or perhaps an IG/XG follow up to the SWEO, taking into account Read 
Write Web of Data, hopefully with some protocol or best practice 
report giving a migration path to standardization?


There are certainly plenty of other groups to take in to account and 
consider in all of this, like the WebID XG.


Best,

Nathan



Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Dave Reynolds wrote:

Okay, I agree, and I'm really not looking to create a lot of work here,
the general gist of what I'm hoping for is along the lines of:

RDF Publishers MUST perform Case Normalization and Percent-Encoding
Normalization on all URIs prior to publishing. When using relative URIs
publishers SHOULD include a well defined base using a serialization
specific mechanism. Publishers are advised to perform additional
normalization steps as specified by URI (RFC 3986) where possible.

RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case
Normalization and Percent-Encoding Normalization.

Two RDF URIs are equal if and only if they compare as equal, character
by character, as Unicode strings.


I'm sort of OK with that but ...

Terms like "RDF Publisher" and "RDF Consumer" need to be defined in 
order to make formal statements like these. The RDF/OWL/RIF specs are 
careful to define what sort of processors are subject to conformance 
statements and I don't think RDF Publisher is a conformance point for 
the existing specs.


This may sound like nit-picking, but that's life with specifications. You 
need to be clear how the last para about "RDF URIs" relates to notions 
like "RDF Consumer".


I wonder whether you might want to instead define notions of Linked Data 
Publisher and Linked Data Consumer to which these MUST/MAY/SHOULD 
conformance statements apply. That way it is clear that a component such 
as an RDF store or RDF parser is correct in following the existing RDF 
specs and not doing any of these transformations but that in order to 
construct a Linked Data Consumer/Publisher some other component can be 
introduced to perform the normalizations. Linked Data as a set of 
constraints and conventions layered on top of the RDF/OWL specs.


Fully agree, had the same conversation with DanC this afternoon and he 
too immediately suggested changing RDF Publisher/Consumer to Linked Data 
Publisher/Consumer. Also ties in with earlier comments about 
standardizing Linked Data, however it's done, or worded, my only care 
here is that it positively impacts the current situation, and doesn't 
negatively impact anybody else.


The specific point on the normalization ladder would have to be defined, of 
course, and you would need to define how to handle schemes unknown to 
the consumer.


All this presupposes some work to formalize and specify linked data. Is 
there anything like that planned?  In some ways Linked Data is an 
engineering experiment and benefits from that freedom to experiment. On 
the other hand interoperability eventually needs clear specifications.


Unsure, but I'll also ask the question, is there anything planned? I'd 
certainly +1 standardization and do anything I could to help the process 
along.



For many reasons it would be good to solve this at the publishing phase,
allow normalization at the consuming phase (can't be precluded as
intermediary components may normalize), and keep simple case sensitive
string comparison throughout the stack and specs (so implementations
remain simple and fast.)


Agreed.


cool, thanks again Dave,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Martin Hepp wrote:

On 20.01.2011, at 15:40, Nathan wrote:

David Booth wrote:

On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]


To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just 
don't do

it.

This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.


+1 from me, address at the publishing phase, allow at the consuming 
phase, keep comparison simple.


I am not sure whether you are also talking of RDFa, but in case you do, 
I would like to add the following:


Hi Martin,

Yes (re RDFa), see: http://webr3.org/urinorm/2 - all the browsers do the 
normalization so you can't even get to the non-normalized URI.


in a browser you'll note that all the URIs get normalized automatically, 
in that it's impossible to programmatically access the "correct" casing. 
That's a problem.


if you run it through the RDFa distiller at w3.org [2] you'll find:

   dc:creator <http://WEBR3.org/nathan#me> .

  <http://WEBR3.org/urinorm/2#example> dc:title "URI Normalization 
Example 2" .


note one of the URIs (the one which required relative path resolution) 
has the scheme normalised.


if you run it through check.rdfa.info you'll find that all the URIs are 
normalized. [3]


if you run it through sigma [4] you'll find everything has been 
normalized. You can also see an RDF view of this [5]


if you run it through URI Burner [6], you'll find that /some/ URIs have 
been normalized. It's also worth noting that this caused all kinds of 
problems - I ended up having to create a new resource at this point w/ 
some RDF & N3 to test URI Burner:


  http://webr3.org/urinorm/3

which led to the empty [7]; then I figured I'd try [8], and if you click 
the creator ( htTp://WEBR3.org/nathan#me ) - since in this case there's no 
normalization (note it was normalized in [6]) - you get a 400 Bad Request [9].


and so on and so forth - far from ideal.

Best,

Nathan

[1] http://www.rdfabout.com/demo/validator/ (normalizes all RDF URIs)
[2] http://www.w3.org/2007/08/pyRdfa/
[3] http://check.rdfa.info/check?url=http://webr3.org/urinorm/2&version=1.0
[4] http://sig.ma/search?q=http://webr3.org/urinorm/2
[5] http://sig.ma/entity/e6a2c8319bb3bf21f4b4639216f114a4.rdf#this
[6] 
http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/2%01this

[7] http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/3
[8] http://linkeddata.uriburner.com/about/html/htTp://WEBR3.org/urinorm/3
[9] http://linkeddata.uriburner.com/about/html/htTp/WEBR3.org/nathan%01me



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

David Booth wrote:

On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]

It seems to me that this is primarily an issue with publishing, and a
little about being sensible about how you pass on links. If I'm going to
put up some linked data I should mint normalized URIs; I should use the
same spelling of the URIs throughout my data; I'll make sure those URIs
dereference and that the data that comes back is stable and useful. If
someone else refers to my resources using an aliased URI (such as a
different case for the protocol) and makes statements about those
aliases then they have simply made a mistake.

To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it. 


This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.


+1 from me, address at the publishing phase, allow at the consuming 
phase, keep comparison simple.





Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan
Okay, I agree, and I'm really not looking to create a lot of work here, 
the general gist of what I'm hoping for is along the lines of:


  RDF Publishers MUST perform Case Normalization and Percent-Encoding 
Normalization on all URIs prior to publishing. When using relative URIs 
publishers SHOULD include a well defined base using a serialization 
specific mechanism. Publishers are advised to perform additional 
normalization steps as specified by URI (RFC 3986) where possible.


  RDF Consumers MAY normalize URIs they encounter and SHOULD perform 
Case Normalization and Percent-Encoding Normalization.


  Two RDF URIs are equal if and only if they compare as equal, 
character by character, as Unicode strings.


For many reasons it would be good to solve this at the publishing phase, 
allow normalization at the consuming phase (can't be precluded as 
intermediary components may normalize), and keep simple case sensitive 
string comparison throughout the stack and specs (so implementations 
remain simple and fast.)
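
As a concrete illustration of the comparison rule proposed above, here is a
minimal Python sketch (the URIs are hypothetical stand-ins, not from the
thread):

```python
# Plain character-by-character comparison, as in the proposal above:
# two URIs that dereference to the same resource still compare unequal,
# which is why normalization should happen at the publishing phase.
a = "http://example.org/~foo"
b = "http://example.org/%7Efoo"   # same resource once dereferenced

print(a == b)   # False - simple, fast string comparison throughout the stack
```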


Does anybody find the above disagreeable?

Best, and cheers for the reply Dave,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Alan Ruttenberg wrote:

On Wed, Jan 19, 2011 at 4:45 PM, Nathan  wrote:

David Wood wrote:

On Jan 19, 2011, at 10:59, Nathan wrote:


ps: as an illustration of how engrained URI normalization is, I've
capitalized the domain names in the to: and cc: fields, I do hope the mail
still comes through, and hope that you'll accept this email as being sent to
you. Hopefully we'll also find this mail in the archives shortly at
htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd
hope that any statements made using these URIs (asserted by man or machine)
would remain valid regardless of the (incorrect?-)casing.


Heh.  OK, I'll bite.  Domain names in email addressing are defined in IETF
RFC 2822 (and its predecessor RFC 822), which defers the interpretation to
RFC 1035 ("Domain names - implementation and specification").  RFC 1035
section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, are
to be compared in a case-insensitive manner.

As far as I know, the W3C specs do not so refer to RFC 1035.


And I'll bite in the other direction, why not treat URIs as URIs? why go
against both the RDF Specification [1] and the URI specification when they
say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why force
case-sensitive matching on the scheme and domain on URIs matching the
generic syntax when the specs say must be compared case insensitively? and
so on and so forth.


[AR]
Which specs?


The various URI/IRI specs and previous revisions of.


http://www.w3.org/TR/REC-xml-names/#NSNameComparison

"URI references identifying namespaces

..

In a namespace declaration, the URI reference is

..

The URI references below are all different for the purposes of identifying
namespaces

..

The URI references below are also all different for the purposes of
identifying namespaces

..

So here is another spec that *explicitly* disagrees with the idea that URI
normalization should be a built-in processing.


As far as I can see, that's only for a URI reference used within a 
namespace, and does not govern usage or normalization when you join the 
URI reference up with the local name to make the full URI.


Out of interest, where is that process defined? I was looking for it the 
other day - for instance in the quoted specification we have the example:


units='Euro'>32.18


Where's the bit of the XML specification which says you join them up by 
concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) 
and 'Euro' to get 'http://ecommerce.example.org/schema#Euro'?


And finally, this is why I specifically asked if the non-normalization 
of RDF URI References had XML Namespace heritage, which had then 
filtered down through OWL, SPARQL and RIF.



[AR] More to document, please: Which data is being junked and scrapped?


will document, but essentially every statement made using a 
non-normalized URI when other statements are also being made about the 
"same" resource using normalized URIs - the two most common cases for 
this will be when people are using "CMS" systems and enter their domain 
name as uppercase in some admin, only to have that filter through to 
URIs in serialized RDF/RDFa, and where bugs in software have led to 
inconsistent URIs over time (for instance where % encoding has been 
fixed, or a :80 has been removed from a URI).



[AR] Hmm. Are you suggesting that the behavior of libraries and clients
should have precedence over specification? My view is that one first looks
to specifications, and then only if specifications are poor or do not speak
to the issue do we look at existing behavior.


Yes I am - that specifications should standardize the behaviour of 
libraries and clients - the level of normalization in URIs published, 
consumed or used by these tools is often determined by non sem web stack 
components, and the sem web components are blocked from normalizing 
these should-not-be-differing-URIs by the sem web specifications.



[AR] I think there are many ways to lose in this scenario. For instance, if
the server redirects then the base is the last in the chain of redirects.
http://tools.ietf.org/html/rfc3986#page-29, 5.1.3. Base URI from the
Retrieval URI. My conclusion - don't engineer this way.


That would be my conclusion too, but as RDF(a) moves in to the realms of 
the CMS systems and out of the hands of the sem web community, it will 
be increasingly engineered this way, it's a very common pattern when 
working with (X)HTML (allows people to test locally or on dev servers 
without changing the content).



Further, essentially all RDFa ever encountered by a browser has the casing
on all URIs in href and src, and all these which are resolved, automatically
normalized - so even if you set the base to  or use it
in a URI, browser tools, extensions, and js based libraries will only ever
see the normalized URIs (and thus be incompatible with the rest of the RDF world).

Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-19 Thread Nathan

David Wood wrote:

On Jan 19, 2011, at 10:59, Nathan wrote:

ps: as an illustration of how engrained URI normalization is, I've capitalized 
the domain names in the to: and cc: fields, I do hope the mail still comes 
through, and hope that you'll accept this email as being sent to you. Hopefully 
we'll also find this mail in the archives shortly at 
htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd hope 
that any statements made using these URIs (asserted by man or machine) would 
remain valid regardless of the (incorrect?-)casing.


Heh.  OK, I'll bite.  Domain names in email addressing are defined in IETF RFC 2822 
(and its predecessor RFC 822), which defers the interpretation to RFC 1035 
("Domain names - implementation and specification").  RFC 1035 section 2.3.3 
states that domain names in DNS, and therefore in (E)SMTP, are to be compared in a 
case-insensitive manner.

As far as I know, the W3C specs do not so refer to RFC 1035.


And I'll bite in the other direction, why not treat URIs as URIs? why go 
against both the RDF Specification [1] and the URI specification when 
they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why 
force case-sensitive matching on the scheme and domain on URIs matching 
the generic syntax when the specs say must be compared case 
insensitively? and so on and so forth.


I have to be honest, I can't see what good this is doing anybody, in 
fact it's the complete opposite scenario, where data is being junked and 
scrapped because we are ignoring the specifications which are designed 
to enable interoperability and limit unexpected behaviour.


I'm currently preparing a list of errors I'm finding in RDF, RDFa and 
linked data tooling to do with this, and I have to admit even I'm 
surprised at the sheer number of tools which are affected.


Additionally there's a very nasty, and common, use case which I can't 
test fully, so would appreciate people taking the time to check their 
own libraries/clients, as follows:


If you find some data with the following setup (example):

  @base  .
  <#t> x:rel <../baz> .

and then you "follow your nose" to , will you 
find any triples about it? (problem 1) and if there's no base on the 
second resource, and it uses relative URIs, then the base you'll be 
using is , and thus, you'll effectively create a 
new set of statements which the author never wrote, or intended (problem 2).


In other words, in this scenario, no matter what you do you're either 
going to get no data (even though it's there) or get a set of statements 
which were never said by the author (because the casing is different).
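
The casing trap is easy to reproduce with a stock URI library; here is a
Python sketch (the mixed-case base is a hypothetical stand-in for the
stripped example above):

```python
from urllib.parse import urljoin

# The base as the author wrote it, with non-normalized casing.
base_as_written = "htTp://EXAMPLE.org/foo/bar"
# The base as an HTTP/browser layer will typically report it.
base_normalized = "http://example.org/foo/bar"

resolved_1 = urljoin(base_as_written, "../baz")
resolved_2 = urljoin(base_normalized, "../baz")

# Note: Python's urljoin has already lowercased the scheme for us -
# normalization is engrained even here - but the host casing survives,
# so the two absolute URIs still differ as RDF URI references.
print(resolved_1)  # http://EXAMPLE.org/baz
print(resolved_2)  # http://example.org/baz
print(resolved_1 == resolved_2)  # False
```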


Further, essentially all RDFa ever encountered by a browser has the 
casing on all URIs in href and src, and all these which are resolved, 
automatically normalized - so even if you set the base to 
 or use it in a URI, browser tools, extensions, and 
js based libraries will only ever see the normalized URIs (and thus be 
incompatible with the rest of the RDF world).


I'll continue on getting the specific examples for current RDF tooling 
and resources and get it on the wiki, but I'll say now that almost every 
tool I've encountered so far "does it wrong" in inconsistent 
non-compatible ways.


Finally, I'll ask again, if anybody has any use case which benefits from 
 and <http://example.org/~foo> being classed 
as different RDF URIs, I'd love to hear it.


[1] """The encoding consists of: ... 2. %-escaping octets that do not 
correspond to permitted US-ASCII characters."""

 - http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
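
A minimal sketch of the percent-encoding normalization the quoted text
points at - decode %-escapes of unreserved characters, uppercase the hex of
anything that must stay encoded (the function name is illustrative, not from
any library):

```python
import re

# RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

def pct_normalize(uri):
    """Decode %-escapes of unreserved octets; uppercase remaining hex."""
    def repl(m):
        ch = chr(int(m.group(1), 16))
        return ch if ch in _UNRESERVED else "%" + m.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)

print(pct_normalize("http://example.org/%7efoo"))  # http://example.org/~foo
print(pct_normalize("http://example.org/a%2fb"))   # http://example.org/a%2Fb ('/' is reserved: kept encoded)
```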

Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-19 Thread Nathan

Hi Alan,

Alan Ruttenberg wrote:

Nathan,

If you are going to make claims about the effect of other
specifications on RDF, could you please include pointers to the parts
of specifications that you are referring to, ideally with illustrative
examples of the problems you are seeing? Absent that it is too difficult to
evaluate your claims.

The conversations on such topics too often devolve into serial opinion
dumping. If this is to be at all productive we need to be as precise
as possible.


Good idea :)

I'll create a new page on the wiki and add some examples over the next 
few days, then reply with a pointer later in the week.


ps: as an illustration of how engrained URI normalization is, I've 
capitalized the domain names in the to: and cc: fields, I do hope the 
mail still comes through, and hope that you'll accept this email as being 
sent to you. Hopefully we'll also find this mail in the archives shortly 
at htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally 
I'd hope that any statements made using these URIs (asserted by man or 
machine) would remain valid regardless of the (incorrect?-)casing.


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-19 Thread Nathan

Dave Reynolds wrote:

On 19/01/2011 3:55 AM, Alan Ruttenberg wrote:


The information on how to fully determine equivalence according to the
URI spec is distributed across a wide and growing number of different
specifications (because it is schema dependent) and could, in
principle, change over time. Because of the distributed nature of the
information it is not feasible to fully implement these rules.
Optionally implementing these rules (each implementor choosing where
on the ladder they want to be) would mean that documents written in
RDF (and derivative languages) would be interpreted differently by
different implementations, which is an unacceptable feature of
languages designed for unambiguous communication. The fact that the
set of rules is growing and possibly changing would lead to a similar
situation - documents that meant one thing at one time could mean
different things later, which is also unacceptable, for the same
reason.


Well put, I meant to point out the implications of scheme-dependence and 
you've covered it very clearly.


Whilst I share the same end goal, I have to stress that *several 
important factors have been omitted*.


The semantic web specifications are not the only ones which affect 
interoperability and compatibility with regard to URIs. Many (most) RDF 
serializations include the use of relative URIs, are affected by base 
mechanisms which are defined by the URIs RFC, dependent on the protocol, 
and by base mechanisms provided by host serialization languages, and 
each of the respective implementations thereof. This covers everything 
from implementations of the http protocol on clients, servers and 
intermediaries, through to implementations of the DOM in XML tooling, 
HTML tooling and the major browsers. It also covers every potential 
component which provides URI support, from open source libraries and 
classes through embedded support in black box applications.


Every single one of the aforementioned are free to (silently) implement 
any of the URI normalization techniques in the URI/IRI RFCs. Each 
implementer of these specifications chooses where on the ladder they 
want to be, and that decision affects & often determines the URIs seen 
by implementations of the semantic web specifications.


These factors cannot be ignored, and they are the factors which the RDF 
specification and semantic web specifications must strive to be 
compatible with, and to normalize the actions of.


Every additional step on the ladder added as a requirement to the RDF 
specification is a step closer to interoperability and compatibility.



David (Wood) clarifies (surprisingly to me as well) that the issue of
normalization could be addressed by the working group. I expect,
however, that any proposed change would quickly be determined to be
counter to the instructions given in the charter on Compatibility and
Deployment Expectation, and if not, would be rejected after justified
objections on this basis from reviewers outside the working group.


+1


As per the above, I'd expect the polar opposite.

+1 to compatibility (with the real, deployed, web - the one we all use)

Best,

Nathan



Re: Property for linking from a graph to HTTP connection meta-data?

2011-01-17 Thread Nathan

William Waites wrote:

* [2011-01-17 16:39:27 +0100] Martin Hepp  
wrote:

] Does anybody know of a standard property for linking a RDF graph to a  
] http:GetRequest, http:Connection, or http:Response instance? Maybe  
] rdfs:seeAlso (@TBL: ;- ))?


If you suppose that the name of the graph is the same as the
request URI (it will not always be, of course) you can link
in the other direction from http:Request using http:requestURI.
I am not sure that http:requestURI has a standard inverse though.


And remember, of course, that the headers are split into different 
groups which relate to different things: many relate to the message (in 
relation to the request), some relate to the server, some relate to the 
entity (an encoded version of the representation for messaging), a few 
(really not many) relate to the representation itself, and a couple 
relate to the resource itself - the resource being the thing the URI 
identifies.
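
A rough, non-authoritative illustration of how a few familiar HTTP/1.1
headers might fall across those groups; the placement of borderline headers
is a judgment call, not a spec citation:

```python
# Hypothetical grouping of common headers per the categories described
# above; several headers arguably belong to more than one group.
HEADER_GROUPS = {
    "message":        ["Date", "Transfer-Encoding", "Via"],
    "server":         ["Server"],
    "entity":         ["Content-Length", "Content-Encoding"],
    "representation": ["Content-Type", "Content-Language", "Last-Modified"],
    "resource":       ["Allow", "Location"],
}

# e.g. look up which group a given header belongs to:
group = next(g for g, hs in HEADER_GROUPS.items() if "Server" in hs)
print(group)  # server
```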


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Dave Reynolds wrote:
On Mon, 2011-01-17 at 16:52 +, Nathan wrote: 
I'd suggest that it's a little more complex than that, and that this may 
be an issue to clear up in the next RDF WG (it's on the charter I believe).


I beg to differ.

The charter does state: 


"Clarify the usage of IRI references for RDF resources, e.g., per SPARQL
Query §1.2.4."

However, I was under the impression that was simply removing the small
difference between "RDF URI References" and the IRI spec (that they had
anticipated). Specifically I thought the only substantive issue there
was the treatment of space and many RDF processors already take the
conservation position on that anyway.


Likewise, apologies as I should have picked my choice of words more 
appropriately, I intended to say that the usage of IRI references was up 
for clarification, and if normalization were deemed an issue then the 
RDF WG may be the place to raise such an issue, and address if needed.


As for RIF and GRDDL, can anybody point me to the reasons why 
normalization is not performed - does this have xmlns heritage?


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Nuno Bettencourt wrote:

Hi,

The doubt remained because in all protocols we were still referring to the 
same URN.


do you mean that there were RDF statements which linked each of the 
protocol specific URIs to a single URN via the same property? eg:


   x:foo 
   x:foo 
   x:foo 

If so, then you could define the property (x:foo above) as an Inverse 
Functional Property which would take care of the sameness for you.
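
Since the archive has stripped the example URIs above, here is a small
Python sketch of the same idea with hypothetical stand-in names - grouping
subjects that share an (inverse functional) property value, which is
essentially the inference an OWL reasoner would draw:

```python
from collections import defaultdict

# Hypothetical triples: three protocol-specific URIs all linked to the
# same URN via x:foo, declared an owl:InverseFunctionalProperty.
triples = [
    ("http://abc.example/doc",  "x:foo", "urn:example:n1"),
    ("https://abc.example/doc", "x:foo", "urn:example:n1"),
    ("ftp://abc.example/doc",   "x:foo", "urn:example:n1"),
]

# Subjects sharing the same (IFP, object) pair denote the same resource.
groups = defaultdict(set)
for s, p, o in triples:
    groups[(p, o)].add(s)

same_as = [subjects for subjects in groups.values() if len(subjects) > 1]
print(same_as)  # one group containing all three subject URIs
```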


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Nuno Bettencourt wrote:
Hi, 


Even though I'll be deviating from the point just a bit, since we're discussing URI 
comparison in terms of RDF, I would like to request some help.

I have a doubt about URLs when it comes to RDF URI comparison. Is there any RFC that establishes if 


http://abc.com:80/~smith/home.html
https://abc.com:80/~smith/home.html
or even
ftp://abc.com:80/~smith/home.html
 
should or should not be considered the same resource?


No, and no such rules can be written (as they are case specific, and all 
the above URIs could easily, and often do, point to differing resources) 
- if all URIs point to the same resource then it should be stated as 
such by some other means, which in RDF would mean owl:sameAs.


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Better be a bit more specific.. in-line..

Nathan wrote:

Kingsley Idehen wrote:

On 1/17/11 10:51 AM, Martin Hepp wrote:

Dear all:

RFC 2616 [1, section 3.2.3] says that

"When comparing two URIs to decide if they match or not, a client  
SHOULD use a case-sensitive octet-by-octet comparison of the entire

   URIs, with these exceptions:

  - A port that is empty or not given is equivalent to the default
port for that URI-reference;
  - Comparisons of host names MUST be case-insensitive;
  - Comparisons of scheme names MUST be case-insensitive;
  - An empty abs_path is equivalent to an abs_path of "/".

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

  http://abc.com:80/~smith/home.html
  http://ABC.com/%7Esmith/home.html
  http://ABC.com:/%7esmith/home.html


As per the percent encoding rules and the set of unreserved characters 
[1], percent encoded octets in certain ranges (see [1]) should not be 
created by URI producers, and when found in a URI should be decoded 
correctly, this includes %7E - also percent encoding is case insensitive 
so %7e and %7E are equivalent, thus you should not produce URIs like 
this, and when found you should fix the error, to produce:


   http://abc.com:80/~smith/home.html
   http://ABC.com/~smith/home.html
   http://ABC.com:/~smith/home.html

The above URIs all use the generic syntax, so the generic component 
syntax equivalence rules always apply [2], so normalization after these 
rules would produce:


   http://abc.com:80/~smith/home.html
   http://abc.com/~smith/home.html
   http://abc.com:/~smith/home.html

Then finally, scheme-specific normalization rules [3] can be applied which 
treat all the port values as being equivalent (for the purposes of naming 
and dereferencing, as defined by the specification for URIs with that 
scheme), which allows you to normalize to:


   http://abc.com/~smith/home.html
   http://abc.com/~smith/home.html
   http://abc.com/~smith/home.html

[1] http://tools.ietf.org/html/rfc3986#section-6.2.2.1
[2] http://tools.ietf.org/html/rfc3986#section-2.3
[3] http://tools.ietf.org/html/rfc3986#section-6.2.3
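
The three-step walkthrough above can be sketched end-to-end for http URIs
with Python's standard library. This is a rough sketch, not a complete RFC
3986 implementation - e.g. path segment normalization is skipped:

```python
import re
from urllib.parse import urlsplit, urlunsplit

# RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

def http_normalize(uri):
    # Step 1: percent-encoding normalization - decode unreserved octets,
    # uppercase the hex digits of anything that must stay encoded.
    def repl(m):
        ch = chr(int(m.group(1), 16))
        return ch if ch in _UNRESERVED else "%" + m.group(1).upper()
    uri = re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)
    # Step 2: case normalization - urlsplit already lowercases the scheme,
    # and the .hostname accessor returns the host lowercased.
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # Step 3: scheme-based normalization - drop an empty or default port.
    if parts.port is not None and parts.port != 80:
        host += ":%d" % parts.port
    return urlunsplit((parts.scheme.lower(), host, parts.path or "/",
                       parts.query, parts.fragment))

for u in ("http://abc.com:80/~smith/home.html",
          "http://ABC.com/%7Esmith/home.html",
          "http://ABC.com:/%7esmith/home.html"):
    print(http_normalize(u))  # all three: http://abc.com/~smith/home.html
```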

Hope that helps refine my previous comments,


Does this also hold for identifying RDF resources


Yes, where an RDF resource is a Data Container at an Address (URL). 
Thus, equivalent results for de-referencing a URL en route to 
accessing data.


No, when "resource" also implies an Entity (Data Item or Data Object) 
that is assigned a Name via URI.


Logically, yes on both counts, we should/could be normalizing these URIs 
as we consume and publish using the syntax based normalization rules [1] 
which apply to all URI/IRIs with the generic syntax (such as the 
examples above)


Any client consuming data, or server publishing data, can use the 
normalization rules, so it stands to reason that it's pretty important 
that we all do it to avoid false negatives.


[1] http://tools.ietf.org/html/rfc3986#section-6.2.2

Best,

Nathan






Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Kingsley Idehen wrote:

On 1/17/11 10:51 AM, Martin Hepp wrote:

Dear all:

RFC 2616 [1, section 3.2.3] says that

"When comparing two URIs to decide if they match or not, a client  
SHOULD use a case-sensitive octet-by-octet comparison of the entire

   URIs, with these exceptions:

  - A port that is empty or not given is equivalent to the default
port for that URI-reference;
  - Comparisons of host names MUST be case-insensitive;
  - Comparisons of scheme names MUST be case-insensitive;
  - An empty abs_path is equivalent to an abs_path of "/".

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

  http://abc.com:80/~smith/home.html
  http://ABC.com/%7Esmith/home.html
  http://ABC.com:/%7esmith/home.html
"

Does this also hold for identifying RDF resources


Yes, where an RDF resource is a Data Container at an Address (URL). 
Thus, equivalent results for de-referencing a URL en route to accessing 
data.


No, when "resource" also implies an Entity (Data Item or Data Object) 
that is assigned a Name via URI.


Logically, yes on both counts, we should/could be normalizing these URIs 
as we consume and publish using the syntax based normalization rules [1] 
which apply to all URI/IRIs with the generic syntax (such as the 
examples above)


Any client consuming data, or server publishing data, can use the 
normalization rules, so it stands to reason that it's pretty important 
that we all do it to avoid false negatives.


[1] http://tools.ietf.org/html/rfc3986#section-6.2.2

Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-17 Thread Nathan

Dave Reynolds wrote:
On Mon, 2011-01-17 at 16:51 +0100, Martin Hepp wrote: 

Dear all:

RFC 2616 [1, section 3.2.3] says that

"When comparing two URIs to decide if they match or not, a client   
SHOULD use a case-sensitive octet-by-octet comparison of the entire

URIs, with these exceptions:

   - A port that is empty or not given is equivalent to the default
 port for that URI-reference;
   - Comparisons of host names MUST be case-insensitive;
   - Comparisons of scheme names MUST be case-insensitive;
   - An empty abs_path is equivalent to an abs_path of "/".

Characters other than those in the "reserved" and "unsafe" sets (see
RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

For example, the following three URIs are equivalent:

   http://abc.com:80/~smith/home.html
   http://ABC.com/%7Esmith/home.html
   http://ABC.com:/%7esmith/home.html
"

Does this also hold for identifying RDF resources

a) in theory and


No. RDF Concepts defines equality of RDF URI References [1] as simply
character-by-character equality of the %-encoded UTF-8 Unicode strings.

Note the final Note in that section:

"""
Note: Because of the risk of confusion between RDF URI references that
would be equivalent if dereferenced, the use of %-escaped characters in
RDF URI references is strongly discouraged. 
"""


which explicitly calls out the difference between URI equivalence
(dereference to the same resource) and RDF URI Reference equality.


I'd suggest that it's a little more complex than that, and that this may 
be an issue to clear up in the next RDF WG (it's on the charter I believe).


For example:

   When a URI uses components of the generic syntax, the component
   syntax equivalence rules always apply; namely, that the scheme and
   host are case-insensitive and therefore should be normalized to
   lowercase.  For example, the URI <HTTP://www.EXAMPLE.com/> is
   equivalent to <http://www.example.com/>.

- http://tools.ietf.org/html/rfc3986#section-6.2.2.1

However, that's only for URIs which use the generic syntax (which most 
URIs we ever touch do use).
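The distinction Dave draws between URI equivalence and RDF URI Reference equality can be seen directly. A small sketch, assuming plain character-by-character string comparison for RDF URI References (as RDF Concepts defines) and case normalization of scheme and host for URI equivalence:

```python
from urllib.parse import urlsplit

# RDF Concepts equality: plain character-by-character comparison.
a = "HTTP://www.EXAMPLE.com/"
b = "http://www.example.com/"
assert a != b  # two distinct RDF URI References...

# ...yet equivalent as URIs once scheme and host are case-normalized.
def case_norm(uri):
    p = urlsplit(uri)
    return p.scheme.lower() + "://" + p.netloc.lower() + (p.path or "/")

assert case_norm(a) == case_norm(b)
```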


It would be great if a normalized-IRI with specific normalization rules 
could be drafted up as part of the next WG charter - after all they are 
a pretty pivotal part of the sem web setup, and it would be relatively 
easy to clear up these issues.


Best,

Nathan



Re: Semantics of rdfs:seeAlso (Was: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?)

2011-01-13 Thread Nathan

Kingsley Idehen wrote:

On 1/13/11 12:04 PM, Nathan wrote:

Hi Kingsley,

Kingsley Idehen wrote:
When our engine describes entities it can publish these descriptions 
using variety of structured data formats that include RDF. The same 
thing applies on the data consumption side. Basically, RDF formats 
are options re. Linked Data (the concept).


A generic problem here, when using non-RDF types with Linked Data over 
HTTP, is that there's currently no way to indicate that a resource 
is/has a set of machine readable "linked data" variants. In many cases 
it is useful to publish and consume linked data in CSV format and 
related (as you well note) - but without prior out-of-band knowledge 
that the representation contains, or is, linked data, the machines are 
pretty much screwed. Typically the RDF variants don't have this 
problem because the media type sets the expectation, so you can conneg 
on an RDF type and know you're getting back "linked data"; you can't do 
this with CSV and related with any expectation that you'll get back 
"linked data". Thus, if there were some way to mark the set of 
representations given upon dereferencing a URI as linked data - 
containing RDF, RDF-able 3-tuples, or a view thereof - it'd be a lot 
friendlier to the web of data in general.


So what happens to RDFa in (X)HTML? Even worse, no DOCTYPE declarations?
What about various JSON dialects for Linked Data graphs?
How about N-Triples? Ditto TriX and others?


Probably wasn't clear; I'm saying there needs to be something (for 
instance a new header) which indicates that the representation contains 
"linked data" - then machines could automatically throw the CSV through 
a csv-linked-data parser and it'd work, likewise every type you 
mentioned above.


The problem here isn't the different types of media, the problem here is

(1) internet media types are dire and badly need to be revisited

(2) there's no information provided to the machine so that it has a hope 
in hell of understanding one of these other variants (unless it has its 
own special media type)


Fix that and the door is opened to all of the above.

 - RDFa needs an indicator at HTTP Message level to say it's "html+rdfa"
 - JSON dialects need to be standardized (coming soon to a WG near you) 
w/ media type registered / well-known

 - N-Triples needs its own media type (it doesn't have one)
 - and so on..

Typically we need a machine to not only ask "Accept: something/rdf" but 
to effectively ask "if linked data give me JSON" (swap json for csv, 
turtle, rdf+xml, whatever)
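The asymmetry is easy to see in a consumer's dispatch logic. A sketch, not a real client - the point is that the registered RDF media types carry the "this is linked data" expectation, while text/csv carries none:

```python
RDF_MEDIA_TYPES = {
    "text/turtle": "turtle parser",
    "application/rdf+xml": "rdf/xml parser",
    # text/csv is deliberately absent: the media type alone gives a
    # consumer no way to know the payload is "linked data".
}

def parser_for(content_type):
    # Drop parameters such as charset before the lookup.
    media = content_type.split(";")[0].strip().lower()
    return RDF_MEDIA_TYPES.get(media)

assert parser_for("text/turtle; charset=utf-8") == "turtle parser"
assert parser_for("text/csv") is None  # no expectation to act on
```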


Best,

Nathan



Re: Semantics of rdfs:seeAlso (Was: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?)

2011-01-13 Thread Nathan

Hi Kingsley,

Kingsley Idehen wrote:
When our engine describes entities it can publish these descriptions 
using variety of structured data formats that include RDF. The same 
thing applies on the data consumption side. Basically, RDF formats are 
options re. Linked Data (the concept).


A generic problem here, when using non-RDF types with Linked Data over 
HTTP, is that there's currently no way to indicate that a resource 
is/has a set of machine readable "linked data" variants. In many cases 
it is useful to publish and consume linked data in CSV format and 
related (as you well note) - but without prior out-of-band knowledge 
that the representation contains, or is, linked data, the machines are 
pretty much screwed. Typically the RDF variants don't have this problem 
because the media type sets the expectation, so you can conneg on an RDF 
type and know you're getting back "linked data"; you can't do this with 
CSV and related with any expectation that you'll get back "linked data". 
Thus, if there were some way to mark the set of representations given 
upon dereferencing a URI as linked data - containing RDF, RDF-able 
3-tuples, or a view thereof - it'd be a lot friendlier to the web of data 
in general.


A typical approach would be to register new media types, +variant kinds, 
for instance text/rdf+csv or suchlike, but these types wouldn't be well 
known throughout the internet, served correctly by default by the likes 
of Apache, or handed off to the correct consuming programs by user 
agents - I'll leave it there, without a proposal, but some indication to 
the machine would/will be needed to make this approach friendlier for 
the web.


and as an aside: I do worry a little that there may be some overloading 
of terms going on here, Linked Data (the concept) and Linked Data (the 
protocol) - I'm unsure exactly how to define Linked Data (the concept), 
but I assume you're referring to a broad range of EAV-variant 3-tuple 
based data with URIs.


Best,

Nathan



Re: Semantics of rdfs:seeAlso (Was: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?)

2011-01-13 Thread Nathan

Dave Reynolds wrote:

On Thu, 2011-01-13 at 06:29 -0500, Tim Berners-Lee wrote:

One *can* argue that the RDFS spec is definitive, and it is very loose in its 
definition.


Loose in the sense of allowing a range of values but as a specification
it is unambiguous in this case, as Martin has already pointed out:

"When such representations may be retrieved, no constraints are placed
on the format of those representations."


I'd suggest that the intended meaning is very ambiguous, primarily 
because it uses overloaded terms, the primary question is whether 
rdfs:seeAlso points to a resource (in the semweb sense, something named 
with a URI) which you are looking for statements about, or whether 
rdfs:seeAlso points to a resource (in the restweb sense, something named 
with a dereferencable URI giving access to a set of representations when 
dereferenced) which generically may have something to do with the subject.


In one case there's a built in expectation of RDF statements, and thus 
the meaning of "no constraints are placed on the format of those 
representations" is naturally constrained to the set of data formats 
which can contain RDF statements - and in the other case its 
interpretation is wide open to conflicting usage, as illustrated by this 
chain of emails.


Best,

Nathan



Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-13 Thread Nathan

Dave Reynolds wrote:

On Thu, 2011-01-13 at 11:43 +, Nathan wrote:

"linked data" is not some term for data with links, it's an engineered 
protocol which has constraints and requirements to make the whole thing 
work.


Where is the spec for this "engineered protocol" and where in that spec
does it redefine rdfs:seeAlso?

[I believe I have reasonably decent understanding of, and experience
with, linked data. It is a useful set of conventions and practices
building on some underlying formal specifications. However, I'm not
aware of those practices being so universally agreed and formally
codified as to justify some of the claims being made in this thread.]


Hence my comment:

"Perhaps this points to a need to standardize Linked Data as a protocol."

Best,

Nathan



Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-13 Thread Nathan

wow, typos to the point of being incomprehensible! fixed:

Nathan wrote:

Martin Hepp wrote:

Hi Nathan:

There are other ways of looking at this, remembering we're in the 
realm of machine readable information and the context of RDF. 
rdfs:seeAlso is used to indicate a resource O which may provide 
additional information about the resource S - information in this 
context being rdf, information for the machine - so we can say that 
if O points to a resource that doesn't contain any information at all 
(no rdf, or isn't the subject of any statements) then we've created a 
meaningless statement, it may as well be { S rdfs:seeAlso [] }


One could easily suggest that it's good for RDF Schema properties to 
have some use in RDF, and thus that if rdfs:seeAlso is used in a 
statement, that it should point to some "information", some rdf for 
the machine, otherwise it's a bit of a pointless property.


Given the above, we could take the meaning of the sentence "no 
constraints are placed on the format of those representations" and 
assert that this simply means that RDF/XML is not required, and that 
any RDF format can be used.


I don't buy into restricting the meaning of "data" in the context of 
RDF to "RDF data". If the subject or object of RDF triples can be any 
Web resource (information and non-information resource), then the 
range of rdfs:seeAlso should also include information resources (i.e., 
data) of a variety of conceptual and syntactic forms.



The "data" part of "linked data" is not generic, machine accessible != 
machine understandable, and that's what this is all about.


"linked data" is not some term for data with links, it's an engineered 
protocol which has constraints and requirements to make the whole thing 
work.


We cannot build a web of data (machine understandable dereferencable 
data) without these constraints.


And PDF, HTML without RDFa as well as images clearly qualify as data. 
They are also clearly machine-accessible. If you are still not 
convinced: What about CSV files or text files containing ACE 
(controlled English), or OData / GData?


I'm far from convinced, and have discussed this at length w/ Kingsley 
and others.


A three column CSV is not linked data, yes you can take linked data and 
format it in a 3 column CSV, and yes with some out of band knowledge 
about a particular CSV you can /convert it to/ to RDF, this is not true 
for /all/ csv files and only ever works if you have prior knowledge of 
the particular file being considered - that is to say, we can't build a 
web of data by publishing csv files, or traverse a web of data by 
setting our Accept headers to "text/csv" and hoping that any data 
received matches our three column constraints (and hoping again when it 
does that it actually is something we can use and not just "x x x"). The 
same is true of text files containing ACE.
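To make the point concrete, a minimal Python sketch with hypothetical data: the bytes parse perfectly well as CSV, but nothing in the file itself carries the semantics a linked data consumer needs:

```python
import csv
import io

rows = list(csv.reader(io.StringIO("col1,col2,col3\nalice,knows,bob\n")))
# The parse succeeds, but nothing here says the columns are
# subject/predicate/object, that the values name web resources, or which
# vocabulary "knows" belongs to - that mapping is exactly the prior,
# out-of-band knowledge described above, without which the machine is stuck.
assert rows[1] == ["alice", "knows", "bob"]
```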


As for OData and GData, sure it is more linky, and looks more like RDF, 
but it's not, and the rabbit hole runs much deeper with these two, but 
essentially it's the difference between people making statements with

open world semantics and people having RDF gleaned from data they've put
out which may take on a different meaning when in RDF, other than that
which was intended.

Ultimately, a big part of the linked data protocol is having machine 
readable and understandable data in negotiable well defined formats 
available at dereferencable http and https scheme URIs - if you drop any 
one of those elements it's simply not "linked data"


Perhaps this points to a need to standardize Linked Data as a protocol.

Best,

Nathan






Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-13 Thread Nathan

Martin Hepp wrote:

Hi Nathan:

There are other ways of looking at this, remembering we're in the 
realm of machine readable information and the context of RDF. 
rdfs:seeAlso is used to indicate a resource O which may provide 
additional information about the resource S - information in this 
context being rdf, information for the machine - so we can say that if 
O points to a resource that doesn't contain any information at all (no 
rdf, or isn't the subject of any statements) then we've created a 
meaningless statement, it may as well be { S rdfs:seeAlso [] }


One could easily suggest that it's good for RDF Schema properties to 
have some use in RDF, and thus that if rdfs:seeAlso is used in a 
statement, that it should point to some "information", some rdf for 
the machine, otherwise it's a bit of a pointless property.


Given the above, we could take the meaning of the sentence "no 
constraints are placed on the format of those representations" and 
assert that this simply means that RDF/XML is not required, and that 
any RDF format can be used.


I don't buy into restricting the meaning of "data" in the context of 
RDF to "RDF data". If the subject or object of RDF triples can be any 
Web resource (information and non-information resource), then the range 
of rdfs:seeAlso should also include information resources (i.e., data) 
of a variety of conceptual and syntactic forms.



The "data" part of "linked data" is not generic, machine accessible != 
machine understandable, and that's what this is all about.


"linked data" is not some term for data with links, it's an engineered 
protocol which has constraints and requirements to make the whole thing 
work.


We cannot build a web of data (machine understandable dereferencable 
data) without these constraints.


And PDF, HTML without RDFa as well as images clearly qualify as data. 
They are also clearly machine-accessible. If you are still not 
convinced: What about CSV files or text files containing ACE (controlled 
English), or OData / GData?


I'm far from convinced, and have discussed this at length w/ Kingsley 
and others.


A three column CSV is not linked data, yes you can take linked data and 
format it in a 3 column CSV, and yes with some out of band knowledge 
about a particular CSV you can /convert it to/ to RDF, this is not true 
for /all/ csv files and only ever works if you have prior knowledge of 
the particular file being considered - that is to say, we can't build a 
web of data by publishing csv files, or traverse a web of data by 
setting our Accept headers to "text/csv" and hoping that any data 
received matches our three column constraints (and hoping again when it 
does that it actually is something we can use and not just "x x x"). The 
same is true of text files containing ACE.


As for OData and GData, sure it is more linky, and looks more like RDF, 
but it's not, and the rabbit hole runs much deeper with these two; but 
essentially it's the difference between people making statements with 
open world semantics, and people having RDF gleaned from data they've 
put out which may take on a different meaning when in RDF, other than 
that which was intended.


Ultimately, a big part of the linked data protocol is having machine 
readable and understandable data in negotiable well defined formats 
available at dereferencable http and https scheme URIs - if you drop any 
one of those elements it's simply not "linked data"


Perhaps this points to a need to standardize Linked Data as a protocol.

Best,

Nathan



Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-13 Thread Nathan

Phil Archer wrote:
Martin seems to be fighting a lone battle, but fwiw I'll add my +1 to 
his comments.


I do take the point that, in context, it's really nice if rdfs:seeAlso 
gives a URI that provides more data in RDF and many applications will 
make that assumption. But to /rely/ on that every time seems at odds 
with the, AIUI, fundamental notion that a URI is an identifier and no more.


I'd say that if you see an rdfs:seeAlso property, sure, send an HTTP 
request, but do it with a suitable accept header. If you get a 200, 
great, add the data, but be ready to deal with a 406 (I've got it but 
not in the format you have specified in your request).
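The behaviour Phil describes can be sketched as a small client. This is only a hypothetical sketch: the Accept header value and the decision to treat 406 as "skip quietly" are the conventions under discussion, not anything specified:

```python
import urllib.error
import urllib.request

RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"

def decide(status):
    """200: RDF came back, add it to the graph; 406: the resource exists
    but offers nothing in the formats we asked for, so skip it quietly;
    anything else is a genuine error."""
    if status == 200:
        return "consume"
    if status == 406:
        return "skip"
    return "error"

def follow_seealso(uri):
    # Dereference an rdfs:seeAlso target with a suitable Accept header.
    req = urllib.request.Request(uri, headers={"Accept": RDF_ACCEPT})
    try:
        with urllib.request.urlopen(req) as resp:
            return decide(resp.status), resp.read()
    except urllib.error.HTTPError as e:
        return decide(e.code), None
```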


Describing a URI with further triples is good, nothing wrong with that, 
but to use that to decide whether or not to dereference an rdfs:seeAlso 
URI means looking for a description of the linked resource and then 
acting accordingly. That sounds like a relatively heavy bit of 
processing that HTTP kind of takes care of for you.


So if you use { S rdfs:seeAlso O }, then O is used with a 
dereferencable http/https scheme URI which dereferences to an 
information resource with a representation in any format, which may or 
may not have something to do with the subject of the triple.


Apologies, previously I'd thought the O was a name which you'd look for 
in the subject position of other statements (as in, any RDF URI 
Reference, any scheme, or any blank node).


Best,

Nathan



Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-12 Thread Nathan

Hi Martin,

Martin Hepp wrote:
For my taste, using rdfs:seeAlso is perfectly valid (yet suboptimal, 
because too unspecific), according to the RDFS spec:


http://www.w3.org/TR/rdf-schema/#ch_seealso

Quote: "rdfs:seeAlso is an instance of rdf:Property that is used to 
indicate a resource that might provide additional information about 
the subject resource.


A triple of the form:

S rdfs:seeAlso O

states that the resource O may provide additional information about S. 
It may be possible to retrieve representations of O from the Web, but 
this is not required. When such representations may be retrieved, ***no 
constraints are placed on the format of those representations***."


There are other ways of looking at this, remembering we're in the realm 
of machine readable information and the context of RDF. rdfs:seeAlso is 
used to indicate a resource O which may provide additional information 
about the resource S - information in this context being rdf, 
information for the machine - so we can say that if O points to a 
resource that doesn't contain any information at all (no rdf, or isn't 
the subject of any statements) then we've created a meaningless 
statement, it may as well be { S rdfs:seeAlso [] }


One could easily suggest that it's good for RDF Schema properties to 
have some use in RDF, and thus that if rdfs:seeAlso is used in a 
statement, that it should point to some "information", some rdf for the 
machine, otherwise it's a bit of a pointless property.


Given the above, we could take the meaning of the sentence "no 
constraints are placed on the format of those representations" and 
assert that this simply means that RDF/XML is not required, and that any 
RDF format can be used.


Generally it appears to me that rdfs:seeAlso is a property for a machine 
to follow in order to get more information, and that much of the usage 
mentioned in this thread requires a property which informs a human that 
they may want to check resource O for more information - essentially 
something similar to a hyperlink in an HTML document with no @rel value.


Best,

Nathan



Reverse Links and HTTP Referer

2011-01-11 Thread Nathan

Hi All,

Two little questions for you all,

1: does anybody send HTTP Referer headers from their linked data clients 
and crawlers?
 - for instance if dereferencing an IRI in the object position of a 
triple, send the subject - and likewise for the other direction.


2: has anybody configured their linked data servers to read the value of 
the Referer header in order to find backlinks, dereference the value, or 
store the reverse links?
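The client side of question 1 is tiny to implement. A sketch only - the URIs are hypothetical, and sending the triple's subject as Referer is the convention being asked about, not an established practice (note "Referer" is the header's historical misspelling in HTTP itself):

```python
import urllib.request

def object_request(subject_uri, object_uri):
    """Build a dereference request for a triple's object position,
    sending the subject URI as the Referer so the server could, in
    principle, discover the backlink."""
    return urllib.request.Request(
        object_uri,
        headers={"Accept": "text/turtle",
                 "Referer": subject_uri})

req = object_request("http://example.org/doc#me",
                     "http://other.example/thing")
assert req.get_header("Referer") == "http://example.org/doc#me"
```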


ps: apologies for the cc to semantic web, looking for answers from 
anybody who may have taken this approach at any time, not just recently.


Best,

Nathan



Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?

2011-01-10 Thread Nathan

Phil Archer wrote:
On the Web in general, URIs don't, or certainly shouldn't, imply any 
particular content type.


They don't imply anything, they name things, and the thing that's named 
can by all means be a representation with a specific media type; in fact 
this is by far the most common usage of URIs, and always has been.







Re: Possible Idea For a Sem Web Based Game?

2010-12-10 Thread Nathan

Melvin Carvalho wrote:

On 21 November 2010 18:12, Toby Inkster  wrote:

On Sun, 21 Nov 2010 20:43:34 +0800
Joshua Shinavier  wrote:


1) a "node" should not be *only* a location, but should also include a
game-specific context. E.g. instead of a node for "London", have a
node for "running from zombies in London", with a geo:location link to
the DBpedia resource for London.

Yep, that's certainly the idea. A node is equivalent to a page in the
CYOA books; not just a physical location. A node may in fact describe a
long journey and so describe many locations.

The fact that my test nodes correspond with locations is entirely a
consequence of the lack of effort and imagination I put into them.


Real world locations might be the way forward.

http://www.youtube.com/watch?v=NMQ5DFkU794

Is this linked data done right?


IMHO it's half the point of linked data: for the internet of things and 
augmented reality to come together properly requires an open, meshed, 
universal data structure that's distributed and not silo-based. There's 
simply no way to augment the real world with digital data from a single 
source or silo - thus, in many respects, that video and game is, imo, 
linked data /consumed/ correctly, and can only be done with linked data.


Best,

Nathan



Re: Is 303 really necessary?

2010-11-26 Thread Nathan

Bob Ferris wrote:

Hello everybody,

I wrote a note as an attempt to clarify a bit the terms Resource, 
Information Resource and Document and their relations (from my point of 
view). Maybe this helps to figure out the bits of the current confusion. 
Please have a look at:


http://infoserviceonto.wordpress.com/2010/11/25/on-resources-information-resources-and-documents/ 


likewise:

  http://webr3.org/apps/notes/web#atypedlink

partial story of creating /a/ semantic web, rather than /the/ semantic web.



Re: Possible Idea For a Sem Web Based Game?

2010-11-20 Thread Nathan
Just a quick thought, when there's already a full web-of-data out there, 
why not use /real/ things in the game, real images, locations and so on?



Melvin Carvalho wrote:

On 20 November 2010 20:13, Toby Inkster  wrote:

On Sat, 20 Nov 2010 18:28:24 +0100
Melvin Carvalho  wrote:


1.  Would each 'location' be a document or a resource?  Web of
Documents vs Web of Resources?

2.  Could we use foaf:image and dcterms:desc for the game pages?

3.  How would you model the link on each page?

Sounds like a pretty awesome idea. I'd personally model it like this:

   <#node1>
   a game:Node ;
   foaf:name "Dark Cave" ;
   foaf:depiction <...> ;
   dcterms:description "..." .

I'd say that game:Node is not disjoint with foaf:Document. That gives
you flexibility - in some cases a node might be a page, and in other
cases you might have several nodes described on the same page.

Links to other places could be accomplished using:

   <#node1>
   game:north <#node2> ;
   game:south  ;
   game:east  .

The description itself would have more detailed descriptions of the
directions like "To the south lies a desolate wasteland.". Directions
you'd want would probably be eight compass points, plus "up", "down",
"inside", "outside".

Each node should probably also have co-ordinates (not in WGS84, but a
made-up co-ordinate system), along the lines of:

   <#node1>
   game:latitude 123 ;
   game:longitude -45 .

This would not be used for gameplay, but to aid authoring new nodes.
You'd want to have your "north" triple link to a node that you could
plausibly reach by going a short distance north.


Hi Toby

Thanks for the detailed reply.  That sounds excellent, exactly what I
was looking for.

In fact compass directions (plus up and down) add a lot to the equation.

However most book based text games will have a description added to
each link, rather than simply directions to travel.  Here's a quick
example from googling 'choose your own adventure' :

http://www.iamcal.com/games/choose/room.php

I think it would be possible to bootstrap some existing stories to the
model if we could expand the idea of game:north to have a link and a
description, in this way, which I was wondering about.

Longer term, I think it would be great to start a simple adventure
game in the classic style of 'the hobbit' or 'hitchikers guide'
text/graphics based adventures from the 80s, however with the twist
that game worlds could link to multiple servers, across the web,
allowing anyone to make a 'game within a game', or web of games.
However, that's probably a greater modeling task, so I wanted to start
more simply to begin with.

So something like
<#action1>
a game:Action ;
dcterms:description "Jump on the Barrel" ;
game:destination 

I'm pretty new to modeling this stuff, so not sure how much sense that makes?


I'm not sure how the rendering would work, but perhaps it's easy
enough in RDFa once we have a model.

I'd be happy to mock-up an interface - perhaps tonight!

--
Toby A Inkster



Re: Is 303 really necessary?

2010-11-19 Thread Nathan

Kingsley Idehen wrote:

On 11/19/10 4:55 PM, David Booth wrote:

On Fri, 2010-11-19 at 07:26 -0500, Kingsley Idehen wrote:
[ . . . ]

To conclude, I am saying:

1. No new HTTP response codes
2. Web Servers continue to return 200 OK for Document URLs
3. Linked Data Servers have option handle Name or Address
disambiguation using 303 redirection for slash URIs
4. Linked Data Servers have option to be like Web Servers i.e. do no
Name or Address disambiguation leaving Linked Data aware user agents
to understand the content of Description Documents
5. Linked Data aware User Agents handle Name or Address
disambiguation.

IMHO: when the dust settles, this is what it boils down to. On our
side, we're done re. 1-5 across our Linked Data server and client
functionality, as delivered by our products :-)


I think the above reflects reality, regardless of what is recommended,
because:

  - some Linked Data Servers *will* serve RDF with 200 response codes via
slash URIs, regardless of what is recommended;

  - some User Agents *will* still try to use that data;

  - those User Agents may or may not care about the ambiguity between the
toucan and its web page;

  - those that do care will use whatever heuristics they have to
disambiguate, and the heuristic of ignoring the 200 response code is
very pragmatic.


David,

Great! We're going to point back to this post repeatedly in the future :-)


I truly hope not; recognizing that some people *will* do whatever the 
hell they please doesn't make what they're doing a good idea, or 
something that should be accepted as best / standard practice.


As David mentioned earlier, having two ways to do things is already bad 
enough (hash/303) without introducing a third. There's already been half 
a decade of problems/ambiguity/nuisance because of the httpRange-14 
resolution, ranging from the technical through the conceptual to the 
community level; why on earth would we want to compound that by 
returning to the messy state that prompted the range-14 issue in the 
first place?
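For contrast, the two sanctioned patterns (hash URIs, and slash URIs with 303) can be sketched from the client's side. The `http_get` callable and the return labels are invented for illustration; this is a toy model of the httpRange-14 behaviour, not a real resolver:

```python
from urllib.parse import urldefrag

def describe(uri, http_get):
    """httpRange-14-style dereference, sketched. `http_get` is a
    callable returning (status, location, body), injected so the
    decision logic stays testable without a network."""
    doc, frag = urldefrag(uri)
    status, location, body = http_get(doc)
    if frag:
        # Hash URI: the fragment names a (possibly non-document) thing
        # described by the document at `doc`.
        return ("thing-described-by", doc, body)
    if status == 303:
        # Slash URI naming a non-information resource; the description
        # lives at the redirect target.
        return ("thing-described-by", location, None)
    if status == 200:
        # A 200 says the URI names an information resource (a document).
        return ("information-resource", doc, body)
    return ("error", doc, None)
```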


Fact is, the current reality is primarily due to the fact there is so 
much confusion with no single clear message coming through, and until 
that happens the future reality is only likely to get messier.


Best,

Nathan



Re: Making Linked Data Fun

2010-11-19 Thread Nathan

Kingsley Idehen wrote:

All,

Here is an example of what can be achieved with Linked Data, for 
instance using BBC Wild Life Finder's data:


1. http://uriburner.com/c/DI463N -- remote SPARQL queries between two 
instances (URIBurner and LOD Cloud Cache) with results serialized in 
CXML (image processing part of the SPARQL query pipeline) .


Enjoy!


Sweet you did it! Pivot and Linked Data joined together, that's a big 
step, congrats!


kutgw,

Nathan




Re: Rough draft poem: Document what art thou?

2010-11-11 Thread Nathan

Didn't they make a film called "The Descriptor!" out of that?

Kingsley Idehen wrote:


I am the Data Container, Disseminator, and Canvas.
I came to be when the cognitive skills of mankind deemed oral history 
inadequate.
I am transcendent, I take many forms, but my core purpose is constant - 
Container, Disseminator, and Canvas.
I am dexterous, I can be blank, partitioned horizontally, horizontally 
and vertically, and if you get me excited I'll show you fractals.

I am accessible in a number of ways, across a plethora of media.
I am loose, so you can access my content too.
I am loose in a cool way, so you can refer to me independent of my content.
I am cool in a loose way, so you can refer to my content independent of me.
I am even cool and loose enough to let you figure out stuff from my 
content, including how it's totally distinct from me.

But...
I am possessive about my coolness, so all Containment, Dissemination, 
and Canvas requirements must first call upon me, wherever I might be.

So...
If you postulate about my demise or irrelevance across any medium, I 
will punish you with confusion!

Remember...
I just told you who I am, when something tells you what it is, and it is 
as powerful as I, best you believe it :-)







Re: Role of URI and HTTP in Linked Data

2010-11-11 Thread Nathan

Kingsley Idehen wrote:

On 11/11/10 10:00 AM, Nathan wrote:

Kingsley Idehen wrote:

On 11/11/10 9:00 AM, David Booth wrote:

On Thu, 2010-11-11 at 07:23 +0100, Jiří Procházka wrote:
[ . . . ]

I think it is flawed trying to enforce "URI == 1 thing"

Exactly right.  The "URI == 1 thing" notion is myth #1 in "Resource
Identity and Semantic Extensions: Making Sense of Ambiguity":
http://dbooth.org/2010/ambiguity/paper.html#myth1
It is a good *goal*, but it is inherently unachievable.


Are you implying that a URI -- an Identifier -- doesn't have a 
Referent (singular)?


http://kingsley.idehen.name/dataspace/person/kidehen#this does not 
name you, it's not a name for you, or the name for you.


It's a name (identifier for the purpose of referencing) of "#this, as 
described by http://kingsley.idehen.name/dataspace/person/kidehen", and 
how "#this, as described by 
http://kingsley.idehen.name/dataspace/person/kidehen" is ultimately 
interpreted depends entirely on context and application.


> If so, what is the URI identifying?

It's identifying, or referring to, "x, as described by y" and what the 
description identifies is open to interpretation and context (a human? 
an agent? a father? a trusted-man? a holder of X? a bearer of Y?).

Nathan,

In your response, I don't sense (in any way) the plurality that I sense 
in David's comments -- for which I sought clarification.


I interpret David's response (maybe inaccurately) as saying:
http://kingsley.idehen.name/dataspace/person/kidehen#this isA URI that 
can have > 1 Referent. None of your expressions imply that.


AFAICT, it's more that Man != Father != TrustedMan: depending on how you 
interpret the resource, you will come to different conclusions as to what 
it identifies (x the Man, x the Father, x the TrustedMan, and so on). 
Those things are all differentFrom each other, so the URI names 
different things in different contexts; but of course it's just one 
thing which can be classified in different ways.
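
The "one thing, classified in different ways" point can be sketched with
plain Python tuples standing in for RDF triples; the class names
ex:Father and ex:TrustedMan are illustrative assumptions, not terms from
any real vocabulary:

```python
# A minimal sketch, using plain tuples in place of a real RDF store.
# The ex:Father / ex:TrustedMan classes are hypothetical illustrations.
PERSON = "http://kingsley.idehen.name/dataspace/person/kidehen#this"
RDF_TYPE = "rdf:type"

graph = {
    (PERSON, RDF_TYPE, "foaf:Person"),
    (PERSON, RDF_TYPE, "ex:Father"),
    (PERSON, RDF_TYPE, "ex:TrustedMan"),
    (PERSON, "foaf:name", "Kingsley Idehen"),
}

def classifications(g, subject):
    """All classes the graph asserts for one subject URI."""
    return sorted(o for s, p, o in g if s == subject and p == RDF_TYPE)

# One URI, one referent, several context-dependent classifications:
print(classifications(graph, PERSON))
```

Each application picks the classification relevant to its context, yet
the URI's referent stays singular.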


Or, perhaps he was referring more to the fact that  does 
identify two entirely different things, not one thing that can be 
classified in two different ways.


I'd suggest "URI == 1 described thing, description open to 
interpretation" as opposed to "URI == X things"; but the reality we are 
faced with is that we need to handle both.


Might be missing something...

Nathan



Re: A(nother) Guide to Publishing Linked Data Without Redirects

2010-11-11 Thread Nathan

Harry Halpin wrote:

The question is how to build Linked Data on top of *only* HTTP 200 -
the case where the data publisher either cannot alter their server
set-up (.htaccess) files or does not care to.


(1) use fragments
(2) if using slashes, admit "I was wrong" and migrate data (REAL URIs do 
change)
(3) come up with a fragile set of rules for slash URIs that only a small 
percentage of the web even knows about, let alone understands


(1) is easy, (2) is harder but possible, (3) is what will probably 
happen, and it's what people want, 200 OK and sort out the mess via 
reasoning later (hopefully).
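
Option (1) works because a fragment is never sent over HTTP: the client
strips it before dereferencing, so the document gets the 200 OK while
the hash URI stays free to name the non-document thing. A minimal sketch
with Python's standard library (the example URI is hypothetical):

```python
from urllib.parse import urldefrag

# A hash URI never reaches the server: the client strips the fragment
# before making the HTTP request, so only the document part is fetched.
uri = "http://example.org/people#alice"  # hypothetical hash URI
doc, frag = urldefrag(uri)

print(doc)   # the document part, which is what gets the 200 OK
print(frag)  # resolved client-side against the retrieved document
```

So a plain 200 OK on the document never conflates it with the thing the
hash URI names, and no server configuration is required.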


IMHO, it really is that simple.

Nobody's willing to discuss (2), and (1) isn't used 100% of the time, so 
it has to be (3).


Doesn't matter what obscure processing rules anybody here comes up with; 
most of the web won't know about them, let alone implement them. So it 
has to be just 200 OK your slashes, and we'll either manage to sort out 
the mess later or junk the ambiguous data.


The answers aren't in the specs; this is a reality issue, and reality 
conflicts with what some want, so reality distortion is needed.


Best,

Nathan




Re: Role of URI and HTTP in Linked Data

2010-11-11 Thread Nathan

Kingsley Idehen wrote:

On 11/11/10 9:00 AM, David Booth wrote:

On Thu, 2010-11-11 at 07:23 +0100, Jiří Procházka wrote:
[ . . . ]

I think it is flawed trying to enforce "URI == 1 thing"

Exactly right.  The "URI == 1 thing" notion is myth #1 in "Resource
Identity and Semantic Extensions: Making Sense of Ambiguity":
http://dbooth.org/2010/ambiguity/paper.html#myth1
It is a good *goal*, but it is inherently unachievable.


Are you implying that a URI -- an Identifier -- doesn't have a Referent 
(singular)?


http://kingsley.idehen.name/dataspace/person/kidehen#this does not name 
you, it's not a name for you, or the name for you.


It's a name (an identifier for the purpose of referencing) of "#this, as
described by http://kingsley.idehen.name/dataspace/person/kidehen", and
how "#this, as described by
http://kingsley.idehen.name/dataspace/person/kidehen" is ultimately
interpreted depends entirely on context and application.


> If so, what is the URI identifying?

It's identifying, or referring to, "x, as described by y" and what the 
description identifies is open to interpretation and context (a human? 
an agent? a father? a trusted-man? a holder of X? a bearer of Y?).


Best,

Nathan



Re: Role of URI and HTTP in Linked Data

2010-11-11 Thread Nathan

David Booth wrote:

On Thu, 2010-11-11 at 07:23 +0100, Jiří Procházka wrote:
[ . . . ]
I think it is flawed trying to enforce "URI == 1 thing" 


Exactly right.  The "URI == 1 thing" notion is myth #1 in "Resource
Identity and Semantic Extensions: Making Sense of Ambiguity":
http://dbooth.org/2010/ambiguity/paper.html#myth1 


good paper

It is a good *goal*, but it is inherently unachievable. 


Yes, "your interpretation of X, as described by Y" where Y is the 
graph you're currently considering in whatever context.



The important thing to keep in mind is that ambiguity is *relative* --
it depends on the application.  An application that does not need to
differentiate the toucan from its web page will still produce correct
answers even if it uses a URI that ambiguously denotes both.  However,
another application that needs to associate a different :hasOwner
property value with the toucan than the web page will need to use a
different URI for each.


Exactly, so as a prudent publisher of data it is wise not to attempt to 
constrain consideration of your data to only one specific application 
where the ambiguity doesn't matter. Similarly, it may not be wise to try 
to prevent free speech by essentially saying "this is a toucan, even if 
everybody else says it's not; ignore them and what they say". Others 
will speak, and some applications will have to reject your 
should-be-valid statements about the toucan, since they conflict with 
their world view, where such distinctions are required.
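
David's point that ambiguity is relative to the application can be
sketched as follows; the URI, the ex:hasOwner property, and its values
are all hypothetical, with plain tuples standing in for RDF triples:

```python
# Hypothetical data: one URI ambiguously denoting both a toucan and
# its web page; tuples stand in for RDF triples.
AMBIGUOUS = "http://example.org/toucan"

statements = {
    (AMBIGUOUS, "ex:hasOwner", "ex:TheZoo"),        # asserted about the bird
    (AMBIGUOUS, "ex:hasOwner", "ex:TheWebmaster"),  # asserted about the page
}

def owners(g, subject):
    """Every ex:hasOwner value the graph asserts for one subject URI."""
    return {o for s, p, o in g if s == subject and p == "ex:hasOwner"}

# An application that assumes one owner per resource must reject the
# merged data; an application that never asks about owners is unaffected.
conflict = len(owners(statements, AMBIGUOUS)) > 1
print(conflict)
```

The same merged graph is harmless to one consumer and inconsistent to
another, which is why the publisher cannot settle the question for
everyone in advance.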


Best,

Nathan


