Hi Hugh.
I have avoided participating in these httpRange-14 debates, but since
you have brought the Facebook Linked Data into the discussion, I feel
compelled to respond. The goal (or my goal) regarding Facebook's
Linked Data provided through its Graph API was to allow for sensible
Linked Data RDF to be published in a way that did not interfere with
maintenance of existing code and in a way that would require very
little maintenance in the future. Please see my inline comments
below, and also some comments at the end.
On Mar 28, 2012, at 6:44 AM, Hugh Glaser wrote:
Executive summary:
TAG, please don't come back with something that does not allow, or
even encourage, sites like Facebook to offer RDF back in return for:
curl -L -H Accept:application/rdf+xml https://www.facebook.com/hugh.glaser
Challenge: Try telling me what to put in sameAs.org for the LD URI
for you on Facebook.
Detail:
I support Jeni et al.'s Proposal, because it is an improvement, and
seems to have some chance of success.
Actually, I am pretty sure I align with Giovanni and his ilk.
My preference is to lose the whole thing (and these discussions!) -
but there is no point, I think, in proposing that because it has no
chance of success.
When people talk about "users", they seem to mean developers.
With regard to Facebook's Graph API, it is indeed targeted toward
developers (Linked Data or otherwise).
The users I think of are the eyeballs that look at and manipulate
the stuff on their screens, usually in a browser.
Also, when a posting on this list has:
"Well, if I wanted to do this, " or "Imagine…"
my own eyeballs sort of glaze over.
Well, there have been 6 years to do it or for someone else to
actually feel the need to do it - if it hasn't blazed a trail in the
huge range of Linked Data-enabled applications (irony intended)
being used by users out there, then it probably isn't a very
important use case.
My slightly shorter story (thanks Dan, that was great, and I read
the whole thing!) involves Facebook as a LD site.
In fact, I think this story is complementary to Dan's, as it gives
some view of the experience that Bob's users will get after Alice's
consultation and the subsequent implementation.
This actually happened to me last night.
Recalling that I now have a LD ID on Facebook, I go to Facebook and
get my ID (well, I think of it as my ID, and it's what I give anyone
if they ask for a link to "me").
https://www.facebook.com/hugh.glaser
(I could stop there, as we all know I already have a problem, but …)
Being a brave little chap, before putting it in my signature as one
of my LD IDs, I decide to check that this is OK, by pasting it into
something that wants a LD ID, such as the W3C validator (in this
case I use curl -H Accept:application/rdf+xml).
It actually gave a 200, so it must be OK, right?
Of course, this doesn't validate because the URI actually does
302 -> 200 and returns text/html in response to my curl.
506 would have been possibly less helpful, by the way.
So I am done - nothing I can do now.
However, being not only brave, but also intrepid, I start googling
for support.
I eventually (it wasn't easy) find that I should be using graph
instead of www.
With excitement, I try
curl -i -L -H Accept:application/rdf+xml https://graph.facebook.com/hugh.glaser
Close, but no cigar.
I get text/javascript back.
More digging (I'll spare you the details)...
curl -i -L -H Accept:text/turtle https://graph.facebook.com/hugh.glaser
I cannot contain my excitement; I have some RDF at last!
So I can use https://graph.facebook.com/hugh.glaser as my Facebook
LD ID.
Er, not quite.
The turtle this returns is
</720591128#>
user:id "720591128" ;
Ah yes, I knew I had a numeric ID, 720591128 - so it being late I
guess my LD ID is https://graph.facebook.com/720591128
Of course, er no, not quite again.
I suddenly notice a little # lurking in the turtle.
So I finally decide that the URI I should put in my signature is
https://graph.facebook.com/720591128#
Of course, this is sufficiently ugly, compared with
https://www.facebook.com/hugh.glaser
that I don't bother, and go to bed.
I'm surprised that the perceived ugliness of a URI (although it is not so
ugly to me; beauty is in the eye of the beholder) would deter someone
from taking advantage of the Linked Data. The only differences, as
you have pointed out, are that graph should be used instead of www,
the FBID 720591128 is used instead of hugh.glaser, and the Linked Data
URI has (what I call) an empty fragment. Here are the reasons for
these differences:
1. I think (without certainty) that it is Facebook's intention that
everything at www.facebook.com be for human eyeballs. Admittedly,
there could be some RDFa, and for some pages, there is RDFa containing
Open Graph Protocol markup (do not conflate the Open Graph Protocol
and the Graph API). "Raw" data is made available --- targeting
developers --- via the Graph API at http://graph.facebook.com (if you
click that link without adding a path, it will redirect to
documentation).
2. The FBID is used instead of the relative "vanity URL" (e.g.,
/hugh.glaser) because not every user has a vanity URL, and even if each
user did, not every *thing* has a vanity URL. The Graph API provides
more than just data about users, and to quote Facebook's documentation
( https://developers.facebook.com/docs/reference/api/ ): "Every object
in the social graph has a unique ID."
3. The use of the empty fragment is the easiest way to take advantage
of how the Graph API works. Prior to serving up text/turtle, the
Graph API served up only JSON at, e.g.,
http://graph.facebook.com/720591128 . That is the place to find data
about you. With little
interference to existing code, when text/turtle is requested, the JSON
is merely translated into text/turtle, making use of the internal
system to provide meaningful semantics. One of the problems is that a
URI needs to be minted for instances (e.g., a user), and given
httpRange-14, I have the choice of using a hash URI and returning 200
OK or using a slash URI and 303'ing to somewhere else. Using the
empty fragment seemed like the most acceptable option. (See dialogue
at the end of this email.)
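To illustrate point 3, here is a minimal Python sketch of why the empty fragment costs nothing on the server side. It assumes only the standard behaviour that an HTTP client removes the fragment before sending a request; the function name document_url is made up for this example and is not part of the Graph API.

```python
from urllib.parse import urldefrag


def document_url(thing_uri: str) -> str:
    """Return the URL an HTTP client actually requests for a URI.

    The fragment (even an empty one) is never sent to the server, so the
    hash URI for a thing and the URL of its document share one endpoint.
    document_url is a hypothetical helper name, not a Graph API call.
    """
    url, _fragment = urldefrag(thing_uri)
    return url


thing = "http://graph.facebook.com/720591128#"  # URI identifying the user
doc = document_url(thing)                       # URL of the data about the user
print(doc)
```

The URI with the empty fragment and the URL without it are different identifiers, yet a GET for either reaches the same existing endpoint, which can then answer 200 OK with no additional 303 service.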
Now I'm not saying that the TAG is going to solve all these issues.
And there are lots of issues about 303 and # and RDFa …
But I think this is a real Use Case for a user, which should mean
that the developer who provides this system (Facebook) is a Use Case
for the TAG.
The developer of the Linked Data would be me. I worked on this while
interning at Facebook during the summer of 2011. I have since
returned to RPI to continue working toward my Ph.D.
I could have gone through a very similar process with almost any
Linked Data site, such as ePrints, myexperiment and DBpedia
(including my own, such as RKBExplorer) - it just happened I wanted
Facebook last night.
And Linked Data people go around saying how exciting it is that
Facebook is offering Linked Data - I can't possibly use this as an
example to a customer, such as Dan's Bob.
This whole experience is just crap.
Perhaps that experience was unpleasant. Here's a marginally better one:
1. When you log into Facebook and go to your timeline (your own page),
the path of the URL in the browser looks like either, e.g.,
/hugh.glaser or /profile.php?id=720591128 . In the latter case, you
have already found your FBID.
2. If you have a vanity URL, like /hugh.glaser , simply do an HTTP GET
for http://graph.facebook.com/hugh.glaser ; the response contains your
FBID.
3. The URI representing you is http://graph.facebook.com/FBID# , where
FBID should be the FBID number.
Yes, there is the HTTPS discrepancy, and yes, this probably isn't
ideal in terms of discovering the URI that identifies a user.
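The parts of those three steps that need no network access can be sketched in Python as follows. This is only an illustration: the helper names fbid_from_profile_url and thing_uri are hypothetical, and step 2 (looking up a vanity URL through the Graph API) is deliberately left out because I do not want to guess at the shape of the response.

```python
from urllib.parse import urlparse, parse_qs

GRAPH_BASE = "http://graph.facebook.com/"  # host named in the steps above


def fbid_from_profile_url(url):
    """Step 1: return the FBID if the URL has the /profile.php?id=... form.

    Returns None for a vanity URL, in which case the FBID has to be looked
    up through the Graph API (step 2, not sketched here).
    """
    parts = urlparse(url)
    if parts.path == "/profile.php":
        ids = parse_qs(parts.query).get("id", [])
        if ids and ids[0].isdigit():
            return ids[0]
    return None


def thing_uri(fbid):
    """Step 3: the URI identifying the user is the Graph API URL plus an empty fragment."""
    return f"{GRAPH_BASE}{fbid}#"


fbid = fbid_from_profile_url("https://www.facebook.com/profile.php?id=720591128")
print(thing_uri(fbid))
```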
If I had trouble with this, exactly what does Facebook expect a
normal user to do?
I'm sure we can point out ways in which Facebook might have done
things better, but that is not the point.
Although I no longer work at Facebook, I would be interested in such
"ways in which Facebook might have done things better." That
discussion would be more appropriate in another thread.
Can they actually make it easy for users using the current or
proposed standards?
TAG, please don't come back with something that does not allow, or
even encourage, sites like Facebook to offer RDF back in return for:
curl -H Accept:application/rdf+xml https://www.facebook.com/hugh.glaser
Best
Hugh
PS
I left the https in, because that is actually what cut and paste
gave me.
I'm guessing that would have been a whole new thread.
http works, too, unless you're trying to access permissions-protected
data, in which case you need to use https and provide a security
token. I'm not sure what the implications are regarding http/https
URIs in Linked Data. Indeed, that would be a whole new thread.
PPS
If you read through to here, or even if you just skipped to here,
then if you really do send me your Facebook LD URI (along with one
or more other ones to pair it with), I will drop everything and put
them in sameAs.org :-)
--
Hugh Glaser,
Web and Internet Science
Electronics and Computer Science,
University of Southampton,
Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Finally, I would like to respond to an earlier comment made by Tom
Heath (sorry for the incomplete-looking cut-and-paste): "a rigorous
assessment of how difficult people *really* find it to understand
distinctions such as 'things vs documents about things'. I've heard
many people claim that they've failed to explain this (or similar)
successfully to developers/adopters; my personal experience is that
everyone gets it, it's no big deal (and IRs/NIRs would probably never
enter into the discussion)." My experience at Facebook agrees with
Tom Heath's experience. The distinction between
"things" and "documents about things" was easily understood. The
main source of contention was its practicality and necessity.
One developer said to me (paraphrase): "I would conflate documents and
things if I could." It is a strange statement to me, but
nevertheless, the distinction was understood.
In the fashion of Dan Brickley, I would like to present another
_hypothetical_ dialogue, one between a proponent of Linked Data and a
typical web developer (although perhaps not quite as clever and
thorough as Dan's).
BEGIN DIALOGUE
Proponent: "I found a way to meaningfully publish our
already-published data as Linked Data, and I've implemented a prototype."
Developer: "Since you've already done it, let's take a look."
Proponent: "Okay, go to [link]."
Developer: "Hmmmm... [skip discussion about Turtle vs. RDF/XML].
Everything looks okay, except I notice these URIs have #me at the
end. Why? Can't we just lose the fragment?"
Proponent: "Well, URIs are used to identify things both on and off the
web. For example, no HTTP GET will ever squeeze you over a cable and
pop you up in my browser."
Developer: "Sure. So what?"
Proponent: "... so we need a way to mint URIs for both things on and
off the web that makes sense with how the web already works."
Developer: "Okay, but why the fragment?"
Proponent: "I'm getting to that. The current standard (which shall
not be named) is based on the notion that any URI for which an HTTP GET
returns with 200 OK (these are URIs without fragments) represents the
document that is retrieved, that is, something *on* the web."
Developer: "Okay... seems logical."
Proponent: "So some conventions have been made for how to identify
things *off* the web. One is to simply add a fragment (understatement
meant to avoid confusion at this point), and that can identify
something *off* the web."
Developer: "So I have to have a fragment? It seems unnecessary and
ugly."
Proponent: "There is an alternative. You can use a URI without a
fragment, but then doing an HTTP GET on the URI must return a 303
which redirects to a document about the thing the URI represents."
Developer: "303? What is that?"
Proponent: "See Other."
Developer: "Never heard of that. I don't want to have to create
another service just to 303 redirect to already-available data. Seems
superfluous. Is there any other way?"
Proponent: "Well, we could actually let the URIs 404. It's not ideal,
but it's legal."
Developer: "No, I don't want anything to 404. Never mind then. What
about this #me? Why 'me'?"
Proponent: "Well, that's just a common convention for saying that
[URL] returns information about [URL]#me. #this is another common one."
Developer: "Hmmm... I don't know about that."
Proponent: "Well, if we don't want to 404, and we don't want to
support 303, we'll need some kind of fragment to conform with the
current standard. We could just have an empty fragment so that the
changes are minimal, both in terms of effort and appearance."
Developer: "Okay... I guess... let's go with that, then."
END DIALOGUE
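For readers who prefer code to dialogue, the options the Proponent walks through can be summarized in a rough Python sketch. It is my own paraphrase of the rule as described in the dialogue, not a normative statement of the standard; the function name and the three labels ("document", "anything", "unknown") are invented for this example.

```python
def what_uri_identifies(uri: str, status: int) -> str:
    """Classify a URI according to the rule described in the dialogue.

    `status` is the HTTP status returned by a GET on the URI with its
    fragment removed. The labels are illustrative, not standard terms.
    """
    if "#" in uri:
        # Hash URI (even with an empty fragment): may identify a thing off the web.
        return "anything"
    if status == 200:
        # No fragment and 200 OK: the URI identifies the retrieved document.
        return "document"
    if status == 303:
        # See Other: the redirect target is a document about the thing.
        return "anything"
    # E.g. 404: legal but not ideal; nothing can be concluded.
    return "unknown"


print(what_uri_identifies("http://graph.facebook.com/720591128#", 200))
print(what_uri_identifies("http://graph.facebook.com/720591128", 200))
```

Under this reading, the empty fragment is the only one of these options that keeps the existing 200 OK endpoint untouched while still letting the URI identify a person rather than a document.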
Glean from the dialogue what you will. How would I describe
httpRange-14? Minimally sufficient.
Jesse Weaver
Ph.D. Student, Patroon Fellow
Tetherless World Constellation
Rensselaer Polytechnic Institute
http://www.cs.rpi.edu/~weavej3/index.xhtml