Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Leigh Dodds
Hi,

On 19 October 2011 23:10, Jonathan Rees j...@creativecommons.org wrote:
 On Wed, Oct 19, 2011 at 5:29 PM, Leigh Dodds leigh.do...@talis.com wrote:
 Hi Jonathan

 I think what I'm interested in is what problems might surface and
 approaches for mitigating them.

 I'm sorry, the writeup was designed to do exactly that. In the example
 in the conflict section, a miscommunication (unsurfaced
 disagreement) leads to copyright infringement. Isn't that a problem?

Yes it is, and these are the issues I think that are worth teasing out.

I'm afraid though that I'll have to admit to not understanding your
specific example. There's no doubt some subtlety that I'm missing (and
a rotten head cold isn't helping). Can you humour me and expand a
little? The bit I'm struggling with is:

[[[
<http://example/x> xhv:license
   <http://creativecommons.org/licenses/by/3.0/> .

According to D2, this says that document X is licensed. According to
S2, this says that document Y is licensed
]]]

Taking the RDF data at face value, I don't see how the D2 and S2
interpretations differ. Both say that http://example/x has a
specific license. How could an S2-assuming client assume that the
data is actually about another resource?

I looked at your specific examples, e.g. Flickr and Jamendo:

The RDFa extracted from the Flickr photo page does seem to be
ambiguous. I'm guessing the intent is to describe the license of the
photo and not the web page. But in that case, isn't the issue that
Flickr aren't being precise enough in the data they're returning?

The RDFa extracted from the Jamendo page includes type information
(from the Open Graph Protocol) saying that the resource is an
album and has a specific Creative Commons license. I think that's
what's intended, isn't it?

Why does a client have to assume a specific stance (D2/S2)? Why not
simply take the data returned at face value? It's then up to the
publisher to be sure that they're making clear assertions.

 There is no heuristic that will tell you which of the two works is
 licensed in the stated way, since both interpretations are perfectly
 meaningful and useful.

 For mitigation in this case you only have a few options
 1. precoordinate (via a disambiguating rule of some kind, any kind)
 2. avoid using the URI inside ... altogether - come up with distinct
 wads of RDF for the 2 documents
 3. say locally what you think ... means, effectively treating these
 URIs as blank nodes
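The ambiguity behind these mitigation options can be made concrete with a small sketch. This is a hypothetical illustration (the function and its return strings are invented, not from the issue-57 document): the same licensing triple names a different licensed work depending on which stance the consumer takes.

```python
# Hypothetical sketch of how the same licensing triple is read under the
# two stances discussed in this thread: "D2" (direct: the URI denotes the
# document retrieved from it) and "S2" (indirect: the URI denotes
# whatever the retrieved RDF says it denotes).

TRIPLE = ("http://example/x", "xhv:license",
          "http://creativecommons.org/licenses/by/3.0/")

def licensed_work(subject_uri, stance, described_thing=None):
    """Return which work the license statement is taken to apply to.

    stance          -- "D2" or "S2"
    described_thing -- under S2, what the publisher's RDF says the URI
                       denotes (e.g. the photo, not the page)
    """
    if stance == "D2":
        # Direct: the statement licenses the document at the URI itself.
        return "the document retrieved from " + subject_uri
    if stance == "S2":
        # Indirect: the statement licenses whatever the data describes.
        return described_thing or "unknown (needs further explanation)"
    raise ValueError("unknown stance: " + stance)

# The same triple yields two different licensed works -- the unsurfaced
# disagreement that can lead to copyright infringement.
d2 = licensed_work(TRIPLE[0], "D2")
s2 = licensed_work(TRIPLE[0], "S2", described_thing="the photo on the page")
```

The point of the sketch is only that `d2 != s2`: nothing in the triple itself tells a consumer which reading was meant, which is why option 1 (a precoordinated disambiguating rule) exists.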

Cheers,

L.

-- 
Leigh Dodds
Product Lead, Kasabi
Mobile: 07850 928381
http://kasabi.com
http://talis.com

Talis Systems Ltd
43 Temple Row
Birmingham
B2 5LS



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Leigh Dodds
Hi,

On 20 October 2011 13:25, Ed Summers e...@pobox.com wrote:
 On Wed, Oct 19, 2011 at 12:59 PM, Leigh Dodds leigh.do...@talis.com wrote:
 So, can we turn things on their head a little. Instead of starting out
 from a position that we *must* have two different resources, can we
 instead highlight to people the *benefits* of having different
 identifiers? That makes it more of a best practice discussion and one
 based on trade-offs: e.g. this class of software won't be able to
 process your data correctly, or you'll be limited in how you can
 publish additional data or metadata in the future.

 I don't think I've seen anyone approach things from that perspective,
 but I can't help but think it'll be more compelling. And it also has
 the benefits of not telling people that they're right or wrong, but
 just illustrate what trade-offs they are making.

 I agree Leigh. The argument that you can't deliver an entity like a
 Galaxy to someone's browser sounds increasingly hollow to me. Nobody
 really expects that, and the concept of a Representation from
 WebArch/REST explains it away to most technical people. Plus, we now
 have examples in the wild like OpenGraphProtocol that seem to be
 delivering drinks, politicians, hotels, etc to machine agents at
 Facebook just fine.

It's the arrival of the OpenGraphProtocol which I think warrants a
more careful discussion. It seems to me that we no longer have to try
so hard to convince people of the value of giving things
de-referencable URIs that return useful data. It's happening now, and
there's immediate and obvious benefit, i.e. integration with Facebook,
better search ranking, etc.

 But there does seem to be a valid design pattern, or even refactoring
 pattern, in httpRange-14 that is worth documenting.

Refactoring is how I've been thinking about it too, i.e. under what
situations might you want separate URIs for a resource and its
description? Dave Reynolds has given some good examples of that.

 Perhaps a good
 place would be http://patterns.dataincubator.org/book/? I think
 positioning httpRange-14 as a MUST instead of a SHOULD or MAY made a
 lot of sense to get the LOD experiment rolling. It got me personally
 thinking about the issue of identity in a practical way as I built web
 applications, that I probably wouldn't otherwise have done.
 But it would've been easier if grappling with it was optional, and
 there were practical examples of where it is useful, instead of having
 it be an issue of dogma.

My personal viewpoint is that it has to be optional, because there's
already a growing set of deployed examples of people not doing it (OGP
adoption). The question is how we can help those users understand the
pitfalls, and the benefits of a slightly cleaner approach. We can also
help them understand how best to publish data to avoid
mis-interpretation.

Simplifying ridiculously just to make a point, we seem to have the
following situation:

* Create de-referencable URIs for things. Describe them with OGP
and/or Schema.org.
Benefit: Facebook integration, SEO

* The above, plus additional # URIs or 303s.
Benefit: the ability to make some finer-grained assertions in some
specific scenarios. Tabulator is happy
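One reason the second option is cheaper than it sounds, at least for hash URIs: the second identifier comes almost for free, because clients strip the fragment before dereferencing, so the thing URI and the document URI are distinct by construction. A minimal sketch (the example URIs are invented):

```python
# Splitting a hash URI into the document URI (what you actually fetch)
# and the fragment (which names the thing described by the document).
from urllib.parse import urldefrag

thing_uri = "http://example.org/galaxies/m31#it"

document_uri, fragment = urldefrag(thing_uri)
# document_uri -> "http://example.org/galaxies/m31"
# fragment     -> "it"
```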

Cheers,

L.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Leigh Dodds
Hi Dave,

Thanks for the response, there's some good examples in there. I'm glad
that this thread is bearing fruit :)

I had a question about one aspect, please excuse the clipping:

On 20 October 2011 10:34, Dave Reynolds dave.e.reyno...@gmail.com wrote:
 ...
 If you have two resources and later on it turns out you only needed one,
 no big deal just declare their equivalence. If you have one resource
 where later on it turns out you needed two then you are stuffed.

Ed referred to refactoring. So I'm curious about refactoring from a
single URI to two. Are developers necessarily stuffed, if they start
with one and later need two?

For example, what if I later changed the way I'm serving data to add a
Content-Location header (something that Ian has raised in the past,
and Michael has mentioned again recently) which points to the source
of the data being returned.

Within the returned data I can include statements about the document
at the URI referred to in the Content-Location header.

Doesn't that kind of refactoring help?

Presumably I could also just drop in a redirect and adopt the current
303 pattern without breaking anything?
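The refactoring being asked about can be sketched as follows. This is a speculative illustration of the idea, not an established pattern; the URIs and header values are invented:

```python
# Sketch of the Content-Location refactoring: keep the original URI as
# the thing identifier, add a Content-Location header naming the source
# document, and include statements about both in the returned data.

THING = "http://example.org/m31"        # original URI, kept for the galaxy
DOC = "http://example.org/m31.rdf"      # new URI for the document itself

def build_response():
    headers = {
        "Content-Type": "text/turtle",
        # Points at the source document of the returned representation.
        "Content-Location": DOC,
    }
    body = "\n".join([
        # Statements about the thing stay on the original URI...
        f"<{THING}> a <http://example.org/Galaxy> .",
        # ...while statements about the document (e.g. its license) hang
        # off the Content-Location URI instead.
        f"<{DOC}> <http://purl.org/dc/terms/license> "
        f"<http://creativecommons.org/licenses/by/3.0/> .",
    ])
    return headers, body

headers, body = build_response()
```

Existing links to the original URI keep resolving unchanged; the new document URI only adds a place to hang document-level metadata.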

Again, I'm probably missing something, but I'm happy to admit
ignorance if that draws out some useful discussion :)

Cheers,

L.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Leigh Dodds
Hi,

On 20 October 2011 23:19, Kingsley Idehen kide...@openlinksw.com wrote:
 On 10/20/11 5:31 PM, Dave Reynolds wrote:

 What's more I really don't think the issues is about not understanding
 about the distinction (at least in the clear cut cases). Most people I
 talk to grok the distinction, the hard bit is understanding why 303
 redirects is a sensible way of making it and caring about it enough to
 put those in place.

 What about separating the concept of indirection from its actual
 mechanics? Thus, conversations about benefits will then have the freedom to
 blossom.

 Here's a short list of immediately obvious benefits re. Linked Data (at any
 scale):

 1. access to data via data source names -- millions of developers world wide
 already do this with ODBC, JDBC, ADO.NET, OLE DB etc.. the only issue is
 that they are confined to relational database access and all its
 shortcomings

 2. integration of heterogeneous data sources -- the ability to coherently
 source and merge disparately shaped data culled from a myriad of data
 sources (e.g. blogs, wikis, calendars, social media spaces and networks, and
 anything else that's accessible by name or address reference on a network)

 3. crawling and indexing across heterogeneous data sources -- where the end
 product is persistence to a graph model database or store that supports
 declarative query language access via SPARQL (or even better a combination
 of SPARQL and SQL)

 4. etc...

 Why is all of this important?
 Data access, integration, and management has been a problem that's straddled
 every stage of computer industry evolution. Managers and end-users always
 think about data conceptually, but continue to be forced to deal with
 access, integration, and management in application logic oriented ways. In a
 nutshell, applications have been silo vectors forever, and in doing so they
 stunt the true potential of computing which (IMHO) is ultimately about our
 collective quests for improved productivity.

 No matter what we do, there are only 24 hrs in a day. Most humans taper out
 at 5-6 hrs before physiological system faults kick in, hence our implicit
 dependency on computers for handling voluminous and repetitive tasks.

 Are we there yet?
 Much closer than most imagine. Our biggest hurdle (as a community of Linked
 Data oriented professionals) is a protracted struggle re. separating
 concepts from implementation details. We burn too much time fighting
 implementation details oriented battles at the expense of grasping core
 concepts.

Maybe I'm wrong, but I think people, especially on this list,
understand the overall benefits you itemize. The reason we talk
about implementation details is that they're important in helping
people adopt the technology: we need specific examples.

We get the benefits you describe from inter-linked dereferenceable
URIs, regardless of what format or technology we use to achieve it.
Using the RDF model brings additional benefits.

What I'm trying to draw out in this particular thread is specific
benefits the #/303 additional abstraction brings. At the moment, they
seem pretty small in comparison to the fantastic benefits we get from
data integrated into the web.

Cheers,

L.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Dave Reynolds

Hi Leigh,

On 21/10/2011 08:04, Leigh Dodds wrote:

Hi Dave,

Thanks for the response, there's some good examples in there. I'm glad
that this thread is bearing fruit :)

I had a question about one aspect, please excuse the clipping:


Clipping is the secret to focused email discussions :)


On 20 October 2011 10:34, Dave Reynoldsdave.e.reyno...@gmail.com  wrote:

...
If you have two resources and later on it turns out you only needed one,
no big deal just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.


Ed referred to refactoring. So I'm curious about refactoring from a
single URI to two. Are developers necessarily stuffed, if they start
with one and later need two?

For example, what if I later changed the way I'm serving data to add a
Content-Location header (something that Ian has raised in the past,
and Michael has mentioned again recently) which points to the source
of the data being returned.

Within the returned data I can include statements about the document
at that URI referred to in the Content-Location header.

Doesn't that kind of refactoring help?


Helps yes, but I don't think it solves everything.

Suppose you have been using http://example.com/lovelypictureofm31 to 
denote M31. Some data consumers use your URI to link their data on M31 
to it. Some other consumers started linking to it in HTML as an IR 
(because they like the picture and the accompanying information, even 
though they don't care about the RDF). Now you have two groups of users 
treating the URI in different ways. This probably doesn't matter right 
now but if you decide later on you need to separate them then you can't 
introduce a new URI (whether via 303 or content-location header) without 
breaking one or other use. Not the end of the world but it's not a 
refactoring if the test cases break :)


Does that make sense?

Dave



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Leigh Dodds
Hi,

On 21 October 2011 08:47, Dave Reynolds dave.e.reyno...@gmail.com wrote:
 ...
 On 20 October 2011 10:34, Dave Reynoldsdave.e.reyno...@gmail.com  wrote:

 ...
 If you have two resources and later on it turns out you only needed one,
 no big deal just declare their equivalence. If you have one resource
 where later on it turns out you needed two then you are stuffed.

 Ed referred to refactoring. So I'm curious about refactoring from a
 single URI to two. Are developers necessarily stuffed, if they start
 with one and later need two?

 For example, what if I later changed the way I'm serving data to add a
 Content-Location header (something that Ian has raised in the past,
 and Michael has mentioned again recently) which points to the source
 of the data being returned.

 Within the returned data I can include statements about the document
 at that URI referred to in the Content-Location header.

 Doesn't that kind of refactoring help?

 Helps yes, but I don't think it solves everything.

 Suppose you have been using http://example.com/lovelypictureofm31 to denote
 M31. Some data consumers use your URI to link their data on M31 to it. Some
 other consumers started linking to it in HTML as an IR (because they like
 the picture and the accompanying information, even though they don't care
 about the RDF). Now you have two groups of users treating the URI in
 different ways. This probably doesn't matter right now but if you decide
 later on you need to separate them then you can't introduce a new URI
 (whether via 303 or content-location header) without breaking one or other
 use. Not the end of the world but it's not a refactoring if the test cases
 break :)

 Does that make sense?

No, I'm still not clear.

If I retain the original URI as the identifier for the galaxy and add
either a redirect or a Content-Location, then I don't see how I break
those linking their data to it as their statements are still made
about the original URI.

But I don't see how I'm breaking people linking to it as if it were an
IR. That group of people are using my resource ambiguously in the
first place. Their links will also still resolve to the same content.

L.





Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Jonathan Rees
On Fri, Oct 21, 2011 at 2:42 AM, Leigh Dodds leigh.do...@talis.com wrote:
 Hi,

 On 19 October 2011 23:10, Jonathan Rees j...@creativecommons.org wrote:
 On Wed, Oct 19, 2011 at 5:29 PM, Leigh Dodds leigh.do...@talis.com wrote:
 Hi Jonathan

 I think what I'm interested in is what problems might surface and
 approaches for mitigating them.

 I'm sorry, the writeup was designed to do exactly that. In the example
 in the conflict section, a miscommunication (unsurfaced
 disagreement) leads to copyright infringement. Isn't that a problem?

 Yes it is, and these are the issues I think that are worth teasing out.

 I'm afraid though that I'll have to admit to not understanding your
 specific example. There's no doubt some subtlety that I'm missing (and
 a rotten head cold isn't helping). Can you humour me and expand a
 little? The bit I'm struggling with is:

 [[[
 <http://example/x> xhv:license
       <http://creativecommons.org/licenses/by/3.0/> .

 According to D2, this says that document X is licensed. According to
 S2, this says that document Y is licensed
 ]]]

 Taking the RDF data at face value, I don't see how the D2 and S2
 interpretations differ. Both say that http://example/x has a
 specific license. How could an S2 assuming client, assume that the
 data is actually about another resource?

By observing D2. D2 is the page at that URI; it is not what is
described by the page. For example, one interpretation describes the
image, while the other doesn't. You get different answers. I'm not
sure how to be more clear.

 I looked at your specific examples, e.g. Flickr and Jamendo:

 The RDFa extracted from the Flickr photo page does seem to be
 ambiguous. I'm guessing the intent is to describe the license of the
 photo and not the web page. But in that case, isn't the issue that
 Flickr aren't being precise enough in the data they're returning?

If you adopt the httpRange-14 rule, what this does is make the Flickr
and Jamendo pages wrong, and if *they* agree, they will change their
metadata. The eventual advantage is that there will be no need to be
clear since a different URI (or blank node) will clearly be used to
name the photo, and will be understood in that way.

I feel you're doing a bait-and-switch here. The topic is, what does
the httpRange-14 rule do for you, NOT whether a different rule (such
as just read the RDF) is better than it for some purposes, or what
sort of agreement might we want to attempt. If you want to do a
comparison of different rules, please change the subject line.

To summarize:

- A rule is something that helps eliminate judgment and uncertainty,
and, ideally, facilitates automated processing.
- These URIs (hashless retrieval-enabled ones) are currently being
used in two different and incompatible ways. In the issue-57 document
I call these ways direct (it's the document found there) and
indirect (just read the RDF).
- If there is no rule, then you can't use one of these URIs without
further explanation as to which way is meant (being clear). Maybe
that's OK.
- Any particular rule will assign 0 or more URIs as direct and 0 or
more as indirect. Any time any URI is assigned *either* way some
benefit will ensue to someone, because uses of the URI in that way
will not require further explanation.
- The httpRange-14 rule assigns one of the two ways to all affected
URIs.  The advantage is that people who want to use URIs in this way,
will be able to use them in this way, and be understood. That is, it
gives you a way to refer to anything on the web - even if you don't
know how to read its content, don't trust the content, etc.  It is a
legacy solution since it grandfathers everything that was on the web
before we started using URIs in these new and different ways.
- Other rules will have advantages in other situations. What the
httpRange-14 rule does for you can be understood independently of the
virtues of other rules, such as the one Ian Davis put forth last fall,
or the more radical rule that says that all such URIs are indirect.
What httpRange-14 does for you is a different matter from whether
something else is better. If you want to shift to comparison shopping,
please change the subject line.

 The RDFa extracted from the Jamendo page includes type information
 (from the Open Graph Protocol) saying that the resource is an
 album and has a specific Creative Commons license. I think that's
 what's intended, isn't it?

 Why does a client have to assume a specific stance (D2/S2)? Why not
 simply take the data returned at face value? It's then up to the
 publisher to be sure that they're making clear assertions.

Taking the information at face value *is* a stance - that's exactly
the S2 (indirect) approach. Saying that all hashless retrieval-enabled
URIs are indirect (S2) would be a perfectly principled and coherent
approach, it's just not the one the TAG advised in 2005.

You have to take a stand (if you use these URIs without somehow
specifying the mode) because in almost all cases D2 and 

Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Jonathan Rees
On Fri, Oct 21, 2011 at 8:15 AM, Jonathan Rees j...@creativecommons.org wrote:
 How could an S2 assuming client, assume that the
 data is actually about another resource?

 By observing D2

Sorry, I'm speaking nonsense. The point is that if you assume S2 or
you assume D2, you'll know (or you'll think you know) what is being
talked about. But you'll get different answers in the two situations.

If D2 (direct reference) is assumed uniformly - no problem.
If S2 (indirect) is assumed - no problem.
If sometimes one and sometimes the other - chaos if there is no
consensus rule that clearly says when one and when the other.

httpRange-14 is just one such possible rule, and it shares with the
other candidate rules the benefit of being a rule.
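Jonathan's decision-procedure framing can be sketched as a small classifier. This is a rough simplification of the 2005 TAG advice, offered only as an illustration of what "being a rule" buys a client, not a complete or authoritative implementation:

```python
# Sketch of the httpRange-14 rule as a client-side decision procedure:
# a 2xx response means the URI denotes the retrieved document (direct);
# a 303 means it may denote anything, with a description elsewhere
# (indirect); hash URIs are indirect because the fragment never reaches
# the server.

def interpret(uri, status=None):
    """Classify a URI as 'direct' or 'indirect' under httpRange-14."""
    if "#" in uri:
        return "indirect"      # fragment identifier: refers via the document
    if status is None:
        return "unknown"       # must dereference to decide
    if 200 <= status < 300:
        return "direct"        # an information resource
    if status == 303:
        return "indirect"      # could be anything; see the Location header
    return "unknown"
```

The benefit of any such rule, as the message above says, is precisely that this function needs no judgment calls or per-publisher negotiation.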

Jonathan



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Norman Gray

Nathan, hello.

On 2011 Oct 20, at 12:54, Nathan wrote:

 Norman Gray wrote:
 Ugh: 'IR' and 'NIR' are ugly obscurantist terms (though reasonable in their 
 original context).  Wouldn't 'Bytes' and 'Thing', respectively, be better 
 (says he, plaintively)?
 
 Both are misleading, since NIR is the set of all things, and IR is a 
 proper subset of NIR, it doesn't make much sense to label it non 
 information resource(s) when it does indeed contain information 
 resources. From that perspective IR and R makes somewhat more sense.

That's true, and clarifying.

Or, more formally, R is the set of all resources (?equivalent to things named 
by a URI).  IR is a subset of that, defined as all the things which return 200 
when you dereference them. NIR is then just R \ IR.

It's NIR that's of interest to this discussion, but there's no way of 
indicating within HTTP that a resource is in that set [1], only that something 
is in IR.
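Norman's formalization can be restated with ordinary set arithmetic (the member names below are invented purely for illustration):

```python
# R is the set of all resources (things named by a URI), IR the subset
# that returns 200 on dereference, and NIR simply R \ IR.

R = {"a-web-page", "an-rdf-document", "the-galaxy-m31", "a-politician"}
IR = {"a-web-page", "an-rdf-document"}   # dereference to 200
NIR = R - IR                             # everything else

# IR and NIR partition R: they are disjoint and together cover R.
```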

Back to your regularly scheduled argumentation...

Norman


[1] Though there is, implicitly, within any RDF that one might subsequently 
receive

-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Lin Clark
On Fri, Oct 21, 2011 at 1:15 PM, Jonathan Rees j...@creativecommons.orgwrote:



 If you adopt the httpRange-14 rule, what this does is make the Flickr
 and Jamendo pages wrong, and if *they* agree, they will change their
 metadata. The eventual advantage is that there will be no need to be
 clear since a different URI (or blank node) will clearly be used to
 name the photo, and will be understood in that way.

 I feel you're doing a bait-and-switch here. The topic is, what does
 the httpRange-14 rule do for you, NOT whether a different rule (such
 as just read the RDF) is better than it for some purposes, or what
 sort of agreement might we want to attempt. If you want to do a
 comparison of different rules, please change the subject line.


I don't think this was a bait-and-switch. I think Leigh made clear that he
was questioning whether we should spend so much time making pages (and
people) wrong. As he said:

Instead of starting out from a position that we *must* have two different
 resources, can we instead highlight to people the *benefits* of having
 different identifiers?


Telling someone they are wrong because they don't follow a rule that they
don't understand or don't see a benefit to is a *must* position. Explaining
how the httpRange-14 rule is better than another is explaining the
*benefits* of having different identifiers.

-Lin


Fully Funded PhD Studentship @ DDIS at the University of Zurich

2011-10-21 Thread Abraham Bernstein
The Dynamic and Distributed Information Systems Group at the University 
of Zurich (http://www.ifi.uzh.ch/ddis) is looking for a


*research doctoral student*

with a keen interest in:

* Large-scale Graph Processing
* Semantic Web / Linked Data
* Distributed Computing

to work on Signal/Collect - a framework for synchronous and asynchronous 
parallel graph processing that allows programmers to express many 
algorithms on graphs in a concise and elegant way. More information on 
Signal/Collect can be found on our project page 
(http://www.ifi.uzh.ch/ddis/research/sc.html  
http://code.google.com/p/signal-collect/).


We offer:

* motivated colleagues who are passionate about research
* a work environment that is well equipped with the newest hardware
  and software technology
* a salary according to the standard university regulations
  (  57'000 CHF / year; increases with experience)
* support for your personal development and career planning
* an attractive work environment both within the research group and 
beyond
  (Zurich is repeatedly voted the city with the highest standard of 
living

   in the world)
* A highly successful PhD program with graduates at top rated 
institutions

  world-wide

You will be collaborating in a larger research team consisting of the 
Dynamic and Distributed Information Systems Group 
(DDIS)(http://www.ifi.uzh.ch/ddis/)  headed by Prof. Abraham Bernstein, 
which is part of the Department of Informatics of the University of 
Zurich. The group is active in International and Swiss National research 
projects and we are looking for candidates to help us continue these 
efforts.


You have:

* a master's degree in informatics, computer science (or an equivalent
  university study)
* expertise in database systems, distributed computing,
  or Semantic Web / Linked Data
  (note: only expertise in one of the fields is necessary;
   multiple areas are desirable)
* good programming skills in several languages (Scala a plus)
* an interest in applying computer science research to real-world 
problems

* excellent command of English
* German is a plus but not required

If you are interested:

Send your application (including CV, final grades, and - if possible - 
thesis copy as a PDF file) via e-mail to:


  Prof. Abraham Bernstein, Ph.D.
  Department of Informatics
  University of Zurich, Switzerland
http://www.ifi.uzh.ch/ddis/bernstein.html
  Email: ddisjobs at lists dot ifi dot uzh dot ch


The University of Zurich is committed to enhancing the number of women 
in scientific positions and, therefore particularly invites women to 
apply. Women who are as qualified for the position in question as male 
applicants will be given priority.



--
-
|  Professor Abraham Bernstein, PhD
|  University of Zürich, Department of Informatics
|  web: http://www.ifi.uzh.ch/ddis/bernstein.html



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Nathan

Norman Gray wrote:

Nathan, hello.
On 2011 Oct 20, at 12:54, Nathan wrote:

Norman Gray wrote:

Ugh: 'IR' and 'NIR' are ugly obscurantist terms (though reasonable in their 
original context).  Wouldn't 'Bytes' and 'Thing', respectively, be better (says 
he, plaintively)?
Both are misleading, since NIR is the set of all things, and IR is a 
proper subset of NIR, it doesn't make much sense to label it non 
information resource(s) when it does indeed contain information 
resources. From that perspective IR and R makes somewhat more sense.


That's true, and clarifying.

Or, more formally, R is the set of all resources (?equivalent to things named by a 
URI).  IR is a subset of that, defined as all the things which return 200 when you 
dereference them. NIR is then just R \ IR.


Indeed, I just wrote pretty much the same thing, but with a looser 
definition at [1], snipped here:



The only potential clarity I have on the issue, and why I've clipped 
above, is that I feel the /only/ property that distinguishes an IR 
from anything else in the universe is that it has a 
[transfer/transport]-protocol as a property of it. In the case of HTTP 
this would be anything that has an HTTP Interface as a property of it.


If we say that anything with this property is a member of set X.

If an interaction with the thing named p:y, using protocol 'p:', is 
successful, then p:y is a member of X.


An X, of course, being what is currently called an Information Resource.

Taking this approach would then position 303 as a clear opt-out built in 
to HTTP which allows a server to remain indifferent and merely point to 
some other X which may, or may not, give one more information as to what 
p:y refers to.



[1] http://lists.w3.org/Archives/Public/www-tag/2011Oct/0078.html

That's my understanding of things any way.


It's NIR that's of interest to this discussion, but there's no way of 
indicating within HTTP that a resource is in that set [1], only that something 
is in IR.


Correct, and I guess technically, and logically, HTTP can only ever have 
awareness of things which have an HTTP Interface as a property. So 
arguing for HTTP to cater for non-HTTP things seems a little illogical 
and, I guess, impossible.



Back to your regularly scheduled argumentation...


Aye, as always, carry on!

Best,

Nathan



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Kingsley Idehen

On 10/21/11 3:09 AM, Leigh Dodds wrote:
[SNIP]


What I'm trying to draw out in this particular thread is specific
benefits the #/303 additional abstraction brings. At the moment, they
seem pretty small in comparison to the fantastic benefits we get from
data integrated into the web.


Data is already integrated on the Web. The issue is the quality and cost 
of said integration.

People using the Web as an information space already work with data. The 
problem is that said data manifests as coarse-grained data objects 
(resources). Thus, people have to resort to brute-force integration of 
disparate data sources. A simple example: an inability to find stuff with 
precision. Ditto the inability to publish data object identifiers that 
have a high propensity for serendipitous discovery.


How does 303 on slash URIs help?

It enables all those existing identifiers on the Web to serve as bona 
fide linked-data-oriented identifiers. Basically, this is about the fact 
that Web users do the following, and will continue to do so:

1. Use location names or data source names (URLs) as actual data object 
identifiers -- inherently ambiguous re. fidelity of fine-grained linked 
data

2. Don't expect to be burdened with the mechanics of de-referencable 
identifiers that act as names/handles

-- and rightfully so.

Linked Data solution developers (client or server side) need to accept 
the following:


1. There are, and will always be, more slash-based URIs than there ever 
will be hash-based URIs -- blogging, tweeting, and commenting ensure that.


2. Name and address disambiguation is critical to any system that deals 
with fine-grained data objects -- that's how it works elsewhere, and the 
Web's architecture already reflects this reality.


What about not doing a 303 on slash URIs, i.e., just returning 200 OK?

That's an option, but it cannot take the form of a replacement for HTTP 
303. This option introduces certain requirements on the part of linked 
data clients, including:


1. Local disambiguation of object name and address.
2. A dependency on relation semantics, which ultimately leads to 
agreement challenges re. vocabularies -- remember, this whole effort is 
supposed to be about loose and late binding of data objects to 
vocabularies/schemas/ontologies.
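To make the client-side implication concrete, here's a rough sketch (not from the thread; the function and return labels are my invention) of how a linked data client applying the httpRange-14 rule might interpret the response to a GET on a hashless URI:

```python
def interpret(status: int) -> str:
    """Sketch of the httpRange-14 rule from the client's point of view."""
    if status == 200:
        # A 2xx retrieval commits the URI to naming an information resource.
        return "information-resource"
    if status == 303:
        # 303 See Other: no commitment; the Location header points at a
        # description of the thing, which may be anything at all.
        return "unconstrained"
    if status in (301, 302, 307):
        # Ordinary redirects: follow them and re-apply the rule at the target.
        return "follow-redirect"
    return "unknown"
```

So under the rule, a plain 200 on a slash URI forecloses the "this names a galaxy" reading; the 303 indirection is what keeps that reading open.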


Conclusion:

The fundamental benefit of slash URIs and 303 boils down to a 
non-disruptive manifestation of the Web's data-space dimensions. Let's put 
existing global-scale identifiers on the Web to good use. Technology 
vendors should take on the burden of handling linked data fidelity.


--

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen










Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Jonathan Rees
On Fri, Oct 21, 2011 at 8:34 AM, Lin Clark lin.w.cl...@gmail.com wrote:


 On Fri, Oct 21, 2011 at 1:15 PM, Jonathan Rees j...@creativecommons.org
 wrote:


 If you adopt the httpRange-14 rule, what this does is make the Flickr
 and Jamendo pages wrong, and if *they* agree, they will change their
 metadata. The eventual advantage is that there will be no need to be
 clear since a different URI (or blank node) will clearly be used to
 name the photo, and will be understood in that way.

 I feel you're doing a bait-and-switch here. The topic is, what does
 the httpRange-14 rule do for you, NOT whether a different rule (such
 as just read the RDF) is better than it for some purposes, or what
 sort of agreement might we want to attempt. If you want to do a
 comparison of different rules, please change the subject line.


 I don't think this was a bait-and-switch. I think Leigh made clear that he
 was questioning whether we should spend so much time making pages (and
 people) wrong.

Come on, I never said making someone wrong was a virtue. I was just
answering honestly the question, what would happen to those pages if
we adopted the rule? Well, those pages would break. That would be sad.
Jamendo and Flickr are negative examples. This is a criticism of the
rule. Maybe that's enough reason to amend the rule, I don't know. If
you adopted a different rule, something else would break, like an
application that reports on the content of RDF pages. Because current
practice is so mixed, we will never end up with 100% compliance with
ANY rule, even one that says that all references are indirect. But
that's not what we were talking about.

I wasn't trying to argue in favor or against compared to alternatives.
I was only trying to answer the question that was asked, which was
what does it do for you. Like all rules, it lowers entropy, and does
it in a certain way that supports certain uses and doesn't support
other uses.

 As he said:

 Instead of starting out from a position that we *must* have two different
 resources, can we
 instead highlight to people the *benefits* of having
 different identifiers?

 Telling someone they are wrong because they don't follow a rule that they
 don't understand or don't see a benefit to is a *must* position. Explaining
 how the httpRange-14 rule is better than another is explaining the
 *benefits* of having different identifiers.
 -Lin

There's a different question that I skipped over because it seems
unrelated, which is whether you need different URIs for different
things. I'm not certain how to answer that. This is an
interoperability issue. If a URI U refers to two documents A and B,
and I say "U has title 'Right'", which document am I referring to, A
or B? That is, which has that title? (or author, etc.) Either you
don't care, in which case there's no reason to say it, or you care, in
which case you have to invent some additional signal to communicate
the distinction.

The question of how many URIs you need has almost nothing to do with
httpRange-14. It would arise no matter how you ended up choosing
between direct vs. indirect.

Jonathan



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Kingsley Idehen

On 10/21/11 8:57 AM, Nathan wrote:

Norman Gray wrote:

Nathan, hello.
On 2011 Oct 20, at 12:54, Nathan wrote:

Norman Gray wrote:

Ugh: 'IR' and 'NIR' are ugly obscurantist terms (though reasonable
in their original context).  Wouldn't 'Bytes' and 'Thing',
respectively, be better (says he, plaintively)?

Both are misleading: since NIR is the set of all things, and IR is a
proper subset of NIR, it doesn't make much sense to label it
"non-information resource(s)" when it does indeed contain information
resources. From that perspective, IR and R make somewhat more sense.


That's true, and clarifying.

Or, more formally, R is the set of all resources (?equivalent to
things named by a URI).  IR is a subset of that, defined as all the
things which return 200 when you dereference them. NIR is then just R
\ IR.
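That set algebra can be written out as a toy sketch (the URIs are invented for illustration):

```python
# R: everything named by a URI; IR: the subset that dereferences to a 200.
R = {"http://ex/page", "http://ex/photo", "http://ex/galaxy/m31"}
returns_200 = {"http://ex/page", "http://ex/photo"}

IR = R & returns_200   # information resources
NIR = R - IR           # NIR is just R \ IR

# IR and NIR partition R: together they cover it, and they never overlap.
```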


Indeed, I just wrote pretty much the same thing, but with a looser
definition at [1], snipped here:


The only potential clarity I have on the issue, and why I've clipped
above, is that I feel the /only/ property that distinguishes an IR
from anything else in the universe, is that it has a
[transfer/transport]-protocol as a property of it. In the case of HTTP
this would be anything that has an HTTP Interface as a property of it.

If we say that anything with this property is a member of set X.

If an interaction with the thing named p:y, using protocol 'p:', is
successful, then p:y is a member of X.

An X, of course, being what is currently called an Information Resource.

Taking this approach would then position 303 as a clear opt-out built
in to HTTP which allows a server to remain indifferent and merely
point to some other X which may, or may not, give one more information
as to what p:y refers to.


[1] http://lists.w3.org/Archives/Public/www-tag/2011Oct/0078.html

That's my understanding of things anyway.


It's NIR that's of interest to this discussion, but there's no way of
indicating within HTTP that a resource is in that set [1], only that
something is in IR.


Correct, and I guess technically, and logically, HTTP can only ever
have awareness of things which have an HTTP Interface as a property.
So arguing for HTTP to cater for non HTTP things, seems a little
illogical and I guess, impossible.


Back to your regularly scheduled argumentation...


Aye, as always, carry on!


Nice explanation.

You've just explained:

1. why http-scheme-based names/handles for data objects are powerful but 
unintuitive
2. why data object names, addresses, and representations must always be 
distinct.


The distinction between a URI (generic name/handle) and a URL 
(locator/address) remains the root cause of confusion. We have two 
*things* that are superficially identical (due to syntax) but 
conceptually different. The core concept is always the key to negating 
the superficial distraction associated with syntax :-)


Link:

1. http://tools.ietf.org/html/rfc3305 -- an interesting read re. URIs 
and URLs.




Best,

Nathan





--

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen











Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Jonathan Rees
On Fri, Oct 21, 2011 at 8:33 AM, Norman Gray nor...@astro.gla.ac.uk wrote:

 Nathan, hello.

 It's NIR that's of interest to this discussion, but there's no way of 
 indicating within HTTP that a resource is in that set [1], only that 
 something is in IR.

The important distinction, I think, is not between one kind of
resource and another, but between the manner in which a URI comes to
be associated with a resource. Terminology is helpful, which is why
people have latched onto NIR, and one possibility is direct (for
old-fashioned Web URIs) and indirect (for semweb / linked data),
applied not to resources but to URIs.

A direct URI always names an IR (in fact a particular one: the one at
that URI), but an indirect one can name either an NIR or an IR (as in
http://www.w3.org/2001/tag/2011/09/referential-use.html, and as
deployed at http://dx.doi.org/). HR14a says (in effect) all
retrieval-enabled hashless URIs are direct, but other rules (like Ian
Davis's) might say other things; the terms are useful independent of
the architecture.

There might be situations in which 'NIR' is a useful category, but I
don't know of any. If you say things like "303 implies NIR" (which is
not justified by httpRange-14 or anything else), you get into trouble
with indirectly named IRs like those at dx.doi.org.

One could adopt a new rule that says an indirect URI cannot name an
IR, in which case if you knew the IR/NIR classification you could know
which kind of URI you had to use and vice versa, but this seems
limiting, unnecessary, and incompatible.

Jonathan



Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Norman Gray

Leigh and all, hello.

On 2011 Oct 21, at 12:52, Leigh Dodds wrote:

 Hi,
 
 On 21 October 2011 08:47, Dave Reynolds dave.e.reyno...@gmail.com wrote:
 ...
[...]
 Suppose you have been using http://example.com/lovelypictureofm31 to denote
 M31. Some data consumers use your URI to link their data on M31 to it. Some
 other consumers started linking to it in HTML as an IR (because they like
 the picture and the accompanying information, even though they don't care
 about the RDF). Now you have two groups of users treating the URI in
 different ways. This probably doesn't matter right now but if you decide
 later on you need to separate them then you can't introduce a new URI
 (whether via 303 or content-location header) without breaking one or other
 use. Not the end of the world but it's not a refactoring if the test cases
 break :)

[...]
 But I don't see how I'm breaking people linking to it as if it were an
 IR. That group of people are using my resource ambiguously in the
 first place. Their links will also still resolve to the same content.


There's always, in practice, going to be ambiguity in this space, either 
because data providers are ambiguous about what their URIs denote, or because 
data consumers misunderstand or misuse them.  The 200/303 distinction is about 
trying to force providers to make their URIs unambiguous (in an IR/NIR sense).

It's starting to sound, to me, as if the costs of this are subtle but messily 
real, and may well outweigh the benefits of a goal which is receding as more 
information providers produce ambiguous URIs, simply because their priorities 
are elsewhere (for example OGP or Facebook, if I'm understanding those two 
examples correctly).  This is an argument for conceding defeat on the 200/303 
thing.

I think we've been here before.

Back in November 2010, there was a thread about Ian Davis's suggestion that 
NIRs should simply return RDF with a 200, explaining in that RDF that they're 
NIRs http://blog.iandavis.com/2010/11/04/is-303-really-necessary/.  My 
understanding of that was 
http://lists.w3.org/Archives/Public/public-lod/2010Nov/0115.html:

 httpRange-14 requires that a URI with a 200 response MUST be an IR; a URI 
 with a 303 MAY be a NIR.
 
 Ian is (effectively) suggesting that a URI with a 200 response MAY be an IR, 
 in the sense that it is defeasibly taken to be an IR, unless this is 
 contradicted by a self-referring statement within the RDF obtained from the 
 URI.
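Purely as an illustration of that defeasible reading (the self-describing triple, the "eg:Galaxy" class, and the return labels are my invention, not part of Ian's proposal):

```python
def classify(uri: str, status: int, triples) -> str:
    """Sketch of the defeasible rule: a 200 is taken to mean IR,
    unless the returned RDF contradicts that about the request URI."""
    if status == 303:
        return "possibly-NIR"                # httpRange-14: 303 commits to nothing
    if status == 200:
        for s, p, o in triples:              # triples parsed from the 200 body
            if s == uri and p == "rdf:type" and o == "eg:Galaxy":
                return "self-declared-NIR"   # self-description defeats the default
        return "IR"                          # the defeasible default
    return "unknown"
```

The ambiguity David Booth pointed to lives in the gap between the two 200 branches: until a client has parsed the body, it cannot tell which one applies.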

The list of references after that message provides very interesting reading (the 
whole thread is good, and this current one is recapitulating lots of it).  
David Booth, in a couple of messages including 
http://lists.w3.org/Archives/Public/public-lod/2010Nov/0235.html, focuses on 
the ambiguity created by Ian's suggestion.

What this current thread seems to be suggesting is that this ambiguity is there 
anyway, and it's just going to get worse, so that the solutions are (a) that 
any information architect should be clear in their own mind about the IR/NIR 
distinction, and (b) that there should be ways of resolving the ambiguity in a 
non-heuristic way.  Ian's November 2010 suggestion seems to do that.

All the best,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Norman Gray

Dave, hello.

On 2011 Oct 20, at 22:31, Dave Reynolds wrote:

 Benefit 2: Conceptual cleanliness and hedging your bets
 
 [...]Even if we can't spot the practical problems right now
 then differentiating between the galaxy itself and some piece of data
 about the galaxy could turn out to be important in practice.
 
 It is.  I want to say that 'line 123 in this catalogue [an existing RDBMS] 
 and line 456 in that one both refer to the same galaxy, but they give 
 different values for its surface brightness'.  There's no way I can 
 articulate that unless I'm explicitly clear about the difference between a 
 billion suns and a database row.

[...]

 Perhaps benefit 2 could be reframed as being about forcing you to
 confront the map/territory distinction so you end up doing better
 modelling - whether or not you implement 303s.

I think that's _very_ true.  Perhaps one can say that any information 
architect should understand the IR/NIR distinction, however they subsequently 
decide to represent this.

 I think the discussion Leigh was trying to start was "can we more
 clearly articulate those benefits of the 'right way'". I was taking a shot
 at that, maybe a very limited, off-target one.

While I think it's very important to be clear about precisely what one's URIs 
refer to, I'm starting to wonder if the benefits of the 'right way' (which is 
the IR/NIR and 200/303 distinction, right?) really are all that massive.

I think your listing of the costs and benefits 
http://lists.w3.org/Archives/Public/public-lod/2011Oct/0158.html is a useful 
summary.

 Most people I
 talk to grok the distinction, the hard bit is understanding why 303
 redirects is a sensible way of making it and caring about it enough to
 put those in place.

Yes: it's becoming clearer to me that this is what the discussion is really 
about, even though it started off being about the lament "why don't people 
understand this distinction?".



You also commented on ways to represent observational data.

 (1) Describe the observations explicitly using something like ISO OM or
 the DataCube vocabulary:
 
   <http://catalogue1.com/observation123> a qb:Observation;
   eg:galaxy <http://iau.org/id/galaxy/m31> ;
   eg:brightness 6.5 ;
   eg:obsdate '2011-10-10'^^xsd:date ;
   qb:dataset <http://catalogue1.com/catalogue/2011> .
 
   <http://catalogue2.com/observation456> a qb:Observation;
   eg:galaxy <http://iau.org/id/galaxy/m31> ;
   eg:brightness 6.8 ;
   eg:obsdate '2011-09-01'^^xsd:date ;
   qb:dataset <http://catalogue2.com/catalogue/2011> .
 
 (2) Each catalogue gives its own URI to its understanding of the
 galaxy so it can assert things directly about it without conflict:
 
   <http://catalogue1.com/galaxy/m31> eg:brightness 6.5 ;
  eg:correspondsTo <http://iau.org/id/galaxy/m31> .
 
   <http://catalogue2.com/galaxy/m31> eg:brightness 6.8 ;
  eg:correspondsTo <http://iau.org/id/galaxy/m31> .

For huge numbers of objects, the _only_ name they have is their number in some 
observational catalogue or other -- there's no canonical IAU name.  In a 
current project, we're setting up the support to be able to say 

<http://catalogue1.com/galaxy/123> cat1:brightness xxx .
<http://catalogue2.com/galaxy/456> cat2:brightness yyy .
<http://catalogue1.com/galaxy/123> owl:sameAs 
<http://catalogue2.com/galaxy/456> .

We probably also want to reify the database rows where the first two statements 
come from, in order to make last-modified-like statements about them, but 
whether we do that with a named graph, or some other way, is a problem we 
haven't had to confront quite yet.
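A consumer of those owl:sameAs links could merge the two catalogue rows along these lines (a minimal sketch; the tuple representation of triples and the merge strategy are assumptions on my part, not our project's actual code):

```python
from collections import defaultdict

def merge_same_as(triples):
    """Union owl:sameAs-linked nodes, then regroup the remaining
    properties under each node's representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "owl:sameAs":
            parent[find(s)] = find(o)

    merged = defaultdict(list)
    for s, p, o in triples:
        if p != "owl:sameAs":
            merged[(find(s), p)].append(o)
    return dict(merged)

triples = [
    ("cat1:galaxy/123", "eg:brightness", 6.5),
    ("cat2:galaxy/456", "eg:brightness", 6.8),
    ("cat1:galaxy/123", "owl:sameAs", "cat2:galaxy/456"),
]
# Both brightness values end up attached to one merged node, so the
# "same galaxy, different surface brightness" claim becomes expressible.
```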

 In *none* of those cases does it make any difference whether, when I
 dereference http://iau.org/id/galaxy/m31 in a browser, I get a web page
 saying "I denote the galaxy M31" or I get a 303 redirect to something
 like http://iau.org/doc/galaxy/m31 which in turn connegs to a web page
 saying "The URI you started with denoted the galaxy M31; me, I'm just a
 web page, you can tell me by the way I walk".

Well, I think it does matter, because in this case, the thing named 
http://catalogue1.com/galaxy/123 could plausibly be either the galaxy or the 
database row (and I suppose I could claim the latter as a NIR, with a following 
wind), and I'd need to be able to state, somewhere, which it is.  But that's 
handled by my providing some RDF somewhere which explains which it is: the 
problem is how to get to that RDF without drawing some ambiguous or wrong 
conclusions on the way.

Best wishes,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread David Booth
On Fri, 2011-10-21 at 09:17 -0400, Jonathan Rees wrote:
[ . . . ]
 There's a different question that I skipped over because it seems
 unrelated, which is whether you need different URIs for different
 things. 

+1 

 I'm not certain how to answer that. This is an
 interoperability issue. If a URI U refers to two documents A and B,
 and I say "U has title 'Right'", which document am I referring to, A
 or B? That is, which has that title? (or author, etc.) Either you
 don't care, in which case there's no reason to say it, or you care, in
 which case you have to invent some additional signal to communicate
 the distinction.

Right, though I would call it an application issue rather than an
interoperability issue, because whether or not it is important to
distinguish the two depends entirely on the application.
Ambiguity/unambiguity should not be viewed as an absolute, but as
*relative* to a particular application or class of applications: a URI
that is completely unambiguous to one application may be hopelessly
ambiguous to a different application that requires finer distinctions.
See Resource Identity and Semantic Extensions: Making Sense of
Ambiguity
http://dbooth.org/2010/ambiguity/paper.html 

 
 The question of how many URIs you need has almost nothing to do with
 httpRange-14. It would arise no matter how you ended up choosing
 between direct vs. indirect.

+1.  With or without httpRange-14, there will always be URIs that are
unambiguous to some applications and ambiguous to others.  This is the
inescapable consequence of the fact that, for the most part, it is
impossible to define anything completely unambiguously -- a principle
well discussed and established in philosophy.



-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Norman Gray

Jonathan, hello.

On 2011 Oct 21, at 14:46, Jonathan Rees wrote:

 A direct URI always names an IR (in fact a particular one: the one at
 that URI), but an indirect one can name either an NIR or an IR (as in
 the http://www.w3.org/2001/tag/2011/09/referential-use.html, and as
 deployed at http://dx.doi.org/ ). HR14a says (in effect) all
 retrieval-enabled hashless URIs are direct, but other rules (like Ian
 Davis's) might say other things; the terms are useful independent of
 the architecture.
 
 There might be situations in which 'NIR' is a useful category, but I
 don't know of any.

I can see that distinction, and the value in it.  I still think that 'NIR' is a 
useful category -- in a way it's the simpler category of the two: you cannot 
download a NIR, no matter how many indirections you follow, whereas if you 
start following indirect links, you might end up at a direct link.  Or: an NIR 
is one of the 'things' that's being talked about in the 'internet of things'.

 If you say things like 303 implies NIR (which is
 not justified by httpRange-14 or anything else),

I don't think anyone's so confused as to say "303 implies NIR".  A lot of 
things would probably be simpler, though, if there were a 20x or 30x status 
code which did mean "this names an NIR, and the content is just commentary on, 
or depictions of, that thing" (that's not a suggestion, by the way!)

All the best,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK




Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Kingsley Idehen

On 10/21/11 10:53 AM, David Booth wrote:

Right, though I would call it an application issue rather than an
interoperability issue, because whether or not it is important to
distinguish the two depends entirely on the application.
Ambiguity/unambiguity should not be viewed as an absolute, but as
*relative*  to a particular application or class of applications: a URI
that is completely unambiguous to one application may be hopelessly
ambiguous to a different application that requires finer distinctions.
See Resource Identity and Semantic Extensions: Making Sense of
Ambiguity
http://dbooth.org/2010/ambiguity/paper.html

+1

Examples of different applications/services where the above applies:

1. World Wide Web -- as a global information space.
2. World Wide Web -- as a global data space.
3. World Wide Web -- as a global knowledge space.

httpRange-14 enables Web users to straddle the items above without 
consequence. The hyperlink is still the driver of application experience.

  The question of how many URIs you need has almost nothing to do with
  httpRange-14. It would arise no matter how you ended up choosing
  between direct vs. indirect.

+1.  With or without httpRange-14, there will always be URIs that are
unambiguous to some applications and ambiguous to others.  This is the
inescapable consequence of the fact that, for the most part, it is
impossible to define anything completely unambiguously -- a principle
well discussed and established in philosophy.


+1


--

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen









Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

2011-10-21 Thread Dave Reynolds

On 21/10/2011 12:52, Leigh Dodds wrote:

Hi,

On 21 October 2011 08:47, Dave Reynolds dave.e.reyno...@gmail.com wrote:

...

On 20 October 2011 10:34, Dave Reynolds dave.e.reyno...@gmail.com wrote:


...
If you have two resources and later on it turns out you only needed one,
no big deal just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.


Ed referred to refactoring. So I'm curious about refactoring from a
single URI to two. Are developers necessarily stuffed, if they start
with one and later need two?

For example, what if I later changed the way I'm serving data to add a
Content-Location header (something that Ian has raised in the past,
and Michael has mentioned again recently) which points to the source
of the data being returned.

Within the returned data I can include statements about the document
at that URI referred to in the Content-Location header.

Doesn't that kind of refactoring help?


Helps yes, but I don't think it solves everything.

Suppose you have been using http://example.com/lovelypictureofm31 to denote
M31. Some data consumers use your URI to link their data on M31 to it. Some
other consumers started linking to it in HTML as an IR (because they like
the picture and the accompanying information, even though they don't care
about the RDF). Now you have two groups of users treating the URI in
different ways. This probably doesn't matter right now but if you decide
later on you need to separate them then you can't introduce a new URI
(whether via 303 or content-location header) without breaking one or other
use. Not the end of the world but it's not a refactoring if the test cases
break :)

Does that make sense?


No, I'm still not clear.

If I retain the original URI as the identifier for the galaxy and add
either a redirect or a Content-Location, then I don't see how I break
those linking their data to it as their statements are still made
about the original URI.

But I don't see how I'm breaking people linking to it as if it were an
IR. That group of people are using my resource ambiguously in the
first place. Their links will also still resolve to the same content.


Ah OK. So you introduce a new, different IR, but preserve the conneg so 
that old HTML page links to the picture still resolve. Yes, you're 
right, I think that does work.


Dave