On 6/16/11 6:53 PM, Giovanni Tummarello wrote:
> Hi Tim,
>
> "Documents" per se (a la the HTTP 200 response) on the web are less and
> less relevant as opposed to the "conceptual entities" that are
> represented by these documents and held, e.g., as DB records inside
> CMSs, social networks, etc.
>
> E.g. a social network is about "people"; those are the important
> entities. Then there might be 1,000 different HTTP documents that you
> can get, e.g. if you're logged in, if you're not logged in, if you have
> one cookie, if you have another cookie, if you add &format=print.
> Specific URLs are pretty irrelevant, as they contain all sorts of extra
> information. Layouts of CMSs or web apps change all the time (and so do
> the HTML docs), but the entities don't.
>
> That's why "HTTP response 200"-level annotations are of such little
> relevance, really. You say you have so many annotations about documents;
> I honestly don't understand what you're referring to. Are these
> HTTP-retrievable documents?
Tim is saying, and pretty clearly: there are a lot of resources in HTML
format on the Web. You access these via URLs (Addresses). Basically, you
GET data from Addresses.
> Where are the annotations? Are we talking about the HTTP headers? About
> the "meta" tags in the <head>? Those are about the subject of the page
> most of the time too, not the page itself.
In these resources (projected as HTML documents) there is a lot of
metadata. Dig a little deeper, and there are also varying degrees of
metadata in the HTTP responses. What doesn't exist is the use of an
abstraction whereby the Subject Matter items (what the HTML docs are
about) are Identified by Names that resolve to their Representations,
which are best served via graph pictorials.
> And this is the idea behind schema.org (Open Graph, whatever), which,
> sorry Tim, you have to live with and we have to make the most of.
Tim is not saying he has a problem with schema.org. He might be implying
that schema.org is deviating from the aspect of AWWW that delivers the
abstraction necessary for schema.org to refer to entities (data objects)
by Names that are distinct from the Addresses of their Representation(s).
> When someone refers to a URL which embeds an Open Graph or schema.org
> annotation, then it is 99.+% certain (with the number of 9s growing as
> the web evolves into a rich app platform) that they refer to the entity
> described in it and not to the web document itself (which can and does
> change all the time and is of no overall conceptual relevance).
We are already dealing with the schema.org issues [1][2] the best way they
can be handled, until opportunity costs veer schema.org's sponsors towards
upping the semantic fidelity of their contribution.
We can live with schema.org, but let's not conflate that effort with some
vital fundamentals re. Linked Data and best practices based on AWWW. In
my eyes, schema.org is a massive vector re. structured data injected
into the Web. Semantic fidelity was never its focus. Basically, as
stated in an older post, Google, Microsoft, Yahoo!, Facebook, and
friends all seek to contribute structured data from their respective
data spaces; this already makes sense to them and is 100% compatible
with their respective business models. Naturally, we would like them to
do more, but you can't tell them to do more; all you can do is make
opportunity costs palpable to them, and eventually they will respond.
> With respect to schema.org, we (as the semantic web community) have not
> been ignored: our work and proposals have been very well considered and
> then disregarded altogether - and for several reasons: 12 years of work,
> no agreement on an ontology, no easy way for people to publish data (the
> 303 thing is complete, total, utter insanity (as I have said in vain so
> many times)), etc.
303 isn't insanity! It is basic computing re. data access by reference.
De-reference (indirection) and address-of operations are fundamental
elements of any kind of environment that allows access, movement, and
manipulation of data. You always have Names and Addresses. In fact, you
have them in the real world, but I don't want to veer down a discussion
of semiotics and philosophy.
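
To make the indirection concrete, here is a toy sketch in Python
(hypothetical example.org names, no particular library): a Name
identifies a thing, and dereferencing the Name (indirection) yields the
Address from which a Representation of the thing is fetched.

# Toy Name -> Address table; on the Web, the 303 response plays this role.
NAME_TO_ADDRESS = {
    "http://example.org/id/alice": "http://example.org/doc/alice",
}

def dereference(name: str) -> str:
    """Resolve a Name to the Address of its Representation (indirection)."""
    return NAME_TO_ADDRESS[name]

print(dereference("http://example.org/id/alice"))
# -> http://example.org/doc/alice: the thing's Name and the document's
#    Address stay distinct, yet one resolves to the other.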
The Web of Documents works because Document Addresses (URLs) have become
intuitive. Evolving the Web into a Linked Data Space is a little trickier
with HTTP URIs, because the Name function unveils a powerful but
unnatural abstraction: an HTTP-URI-based Name looks and feels like an
HTTP-URI-based Location Name (Address).
HTTP 303 is just doing what programming languages do behind the scenes
whenever you access data objects by reference. If anything, it's a
tribute to the flexibility of the HTTP protocol. Basically, we now have
the Web pulling off the same data access and manipulation capabilities
that host operating systems have delivered to systems developers since
forever.
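
For the curious, the whole dance is visible with a few lines of Python;
this is only a sketch, assuming a server that implements the 303 pattern
(DBpedia's entity URIs have historically answered this way):

import http.client

# http.client does not follow redirects, so the 303 itself stays visible.
conn = http.client.HTTPSConnection("dbpedia.org")
conn.request("GET", "/resource/Berlin", headers={"Accept": "text/html"})
resp = conn.getresponse()

# Expect a 303 whose Location header names the document's Address
# (e.g. a /page/Berlin URL), distinct from the thing's Name above.
print(resp.status, resp.getheader("Location"))
conn.close()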
[SNIP]
Kingsley
> Gio
On Thu, Jun 16, 2011 at 7:04 PM, Tim Berners-Lee <[email protected]> wrote:
I disagree with this post very strongly, and it is hard to know
where to start,
and I am surprised to see it.
On 2011-06-13, at 07:41, Richard Cyganiak wrote:
> On 13 Jun 2011, at 09:59, Christopher Gutteridge wrote:
>> The real problem seems to me that making resolvable HTTP URIs for
>> real-world things was a clever but dirty hack and does not make any
>> semantic sense.
>
> Well, you worry about *real-world things*, but even people who just
> worry about *documents* have said for two decades that the web is
> broken because it conflates names and addresses.
No, some people didn't get the architecture, in that they had learned
systems where there was a big distinction between names and addresses,
and these had different properties, and then they came across URIs,
which had properties of both.
> And they keep proposing things like URNs and info: URIs and tag: URIs
> and XRIs and DOIs to fix that and to separate the naming concern from
> the address concern. And invariably, these things fizzle around in
> their little niche for a while and then mostly die, because this aspect
> that you call a “clever but dirty hack” is just SO INCREDIBLY USEFUL.
> And being useful trumps making semantic sense.
I agree ... except that the URI architecture being like names and
like addresses isn't a "clever but dirty hack".
You then connect this with the idea of using HTTP URIs for
real-world things, which is a separate question.
This again is a question of architecture. Of design of a system.
We can make it work either way.
We have to work out which is best.
I don't think 303 is a quick and dirty hack.
It does mean a large extension of HTTP to be used with non-documents.
It does have efficiency problems.
It is an architectural extension to the web architecture.
>
> HTTP has been successfully conflating names and addresses since 1989.
That is COMPLETELY irrelevant.
It is not a question of the web being fuzzy or ambiguous and
getting away with it.
It is a clean architecture where the concepts of "name" and
"address" don't connect directly with those of people or files on
a disk or IP hosts.
>
> There are a trillion web pages out there, all named with URIs. And even
> if just 0.1% of these pages are unambiguously about a single specific
> thing, that gives us a billion free identifiers for real-world
> entities, all already equipped with rich *human-readable*
> representations, and already linked and interconnected with
> *human-readable*, untyped, @href links.
>
> And these one billion URIs are plain old http:// URIs. They don't have
> a thing:// in the beginning, nor a tdb://, nor a #this or #that in the
> end, nor do they respond with 303 redirects or to MGET requests or
> whatever other nutty proposals we have come up with over the years to
> disambiguate between page and topic. They are plain old http:// URIs.
> A billion.
>
> Then add to that another huge number that already responds with JSON
> or XML descriptions of some interesting entity, like the one from
> Facebook that Kingsley mentioned today in a parallel thread. Again, no
> thing:// or tdb:// or #this or 303 or MGET on any of them.
>
> I want to use these URIs as identifiers in my data, and I have no
> intention of redirecting through an intermediate blank node just
> because the TAG fucked up some years ago.
If you want to give yourself the luxury of being able to refer to
the subject of a web page, without having to add anything to
disambiguate it from the web page, then, for the sake of your
system, so you can use the billion web pages for your purposes,
you now stop others like me from using semantic web systems to
refer to those web pages, or in fact to the other hundred million
web pages either.
Maybe you should find an efficient way of doing what you want without
destroying the system (which you as well have done so much to build).
>
> I want to tell the publishers of these web pages that they could join
> the web of data just by adding a few @rels to some <a>s, and a few
> @properties to some <span>s, and a few @typeofs to some <div>s (or
> @itemtypes and @itemprops). And I don't want to explain to them that
> they should also change http:// to thing:// or tdb:// or add #this or
> #that or make their stuff respond with 303 or to MGET requests because
> you can't squeeze a dog through an HTTP connection.
Well actually I really want them to put metadata about BOTH the
document and its subject.
There are masses of metadata already about documents.
Now you want to make it ambiguous, so I don't know whether it is
about the document or its subject?
I don't think something like about="#product" is rocket science or
unnatural.
I really want people to be able to use RDF or microdata to say
things about more than one thing in the same page.
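
A minimal sketch of what that looks like, using rdflib with hypothetical
example.org URIs (not Tim's own example): the page and its subject each
get their own URI, separated by the fragment, and a triple ties them
together.

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, FOAF, RDF

page = URIRef("http://example.org/alice")       # the document
person = URIRef("http://example.org/alice#me")  # its subject

g = Graph()
g.add((page, RDF.type, FOAF.Document))
g.add((page, DCTERMS.title, Literal("Alice's homepage")))
g.add((person, RDF.type, FOAF.Person))
g.add((person, FOAF.name, Literal("Alice")))
g.add((page, FOAF.primaryTopic, person))        # document <-> subject link

print(g.serialize(format="turtle"))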
>
> And here you and Pat and Alan (and TimBL, for that matter) are
> preaching that we can't use this one billion of fantastic free URIs to
> identify things because it wouldn't make semantic sense.
We are saying that actually we already are using them to refer to
the web pages and that that is very important and so is all the
existing web.
>
> Being useful trumps making semantic sense.
That is romantic nonsense. To be useful you need clean, extensible
architecture and well-defined concepts.
> The web succeeded *because* it conflates name and address.
That is completely irrelevant nonsense.
It succeeded with a clean architecture using URIs for web pages,
and the # as punctuation syntax between the identifier of the page
and the local identifier within the page.
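
That punctuation works entirely on the client side; a small illustration
(hypothetical URI) of the fragment being stripped before any HTTP
request is made, so no 303 round trip is needed:

from urllib.parse import urldefrag

page_uri, fragment = urldefrag("http://example.org/alice#me")
print(page_uri)  # http://example.org/alice -- what actually gets GET-ed
print(fragment)  # me -- the local identifier within the page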
> The web of data will succeed *because* it conflates a thing and a web
> page about the thing.
>
> <http://richard.cyganiak.de/>
>     a foaf:Document;
>     dc:title "Richard Cyganiak's homepage";
>     a foaf:Person;
>     foaf:name "Richard Cyganiak";
>     owl:sameAs <http://twitter.com/cygri>;
>     .
>
> There.
>
> If your knowledge representation formalism isn't smart enough to make
> sense of that, then it may just not be quite ready for the web, and
> you may have some work to do.
Formalisms aren't smart.
Sure, I can make a program to make sense of that.
But I'm not going to just to save you the effort of getting it right.
Disappointed by the intensity of your posting.
Systems have managed for a long time to distinguish between
library card and book,
between message header and message,
between a book and its subject.
Now we have masses of information about many books and about many other
things; we have great value in it.
Let's not mess it up.
If you want an ambiguous source of information, use natural language.
The power of data is that it is a whole lot less ambiguous.
Tim
>
> Best,
> Richard
>
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen