In message <[email protected]>, Nathan <[email protected]>
writes
> Thus, in addition to nudging at general awareness of these issues, I do
> wonder who (if anybody) is working on spam (and unethical usage)
> solutions for the web of data?
A brave choice of subject line. If it hadn't been from you (and how was
I to know it really was?) I would have deleted it unread in my
e-mail-previewing program (AKA my post-spam-filter-filter).
I think the answer is that we are wide open to this, especially with
data at the "simple triples" level. I was thinking just yesterday about
how museums might publish their collections objects as Linked Data.[1]
If they were to follow the DBpedia model, and publish a set of
[unrelated] triples with the object identifier as subject, embedded in
their web page for the object, there is nothing to stop someone else
putting out a page containing lies about that object, also expressed as
simple triples with the object URL as subject. By the time Google has
indexed both those pages "semantically" (see yesterday's acquisition of
Freebase) and merged the results in its uber-index, you won't know the
difference. Not a likely scenario for museum objects, I guess, but very
much a possibility for commercially-sensitive and personal information.
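
To make the scenario concrete, here is a minimal Python sketch using plain
tuples rather than any RDF library. The URIs, property names and values are
all invented for illustration. It only shows that once two sets of simple
triples about the same subject are merged, nothing in the result says which
page each statement came from.

```python
# Hypothetical URIs and values, invented for illustration only.
OBJECT = "http://museum.example.org/object/1234"

# Triples the museum embeds in its own web page for the object.
museum_triples = {
    (OBJECT, "dc:title", "Brass sextant"),
    (OBJECT, "dc:date", "1790"),
}

# Triples a third party publishes elsewhere, using the same subject URI.
other_triples = {
    (OBJECT, "dc:title", "Cheap replica"),
    (OBJECT, "dc:date", "1985"),
}

# A naive aggregator simply takes the union of everything it has crawled.
merged = museum_triples | other_triples

# Both titles come back, with no way to tell whose claim is whose.
titles = sorted(o for s, p, o in merged if s == OBJECT and p == "dc:title")
print(titles)  # ['Brass sextant', 'Cheap replica']
```

A bare (subject, predicate, object) triple has no slot for its origin, so the
merged set cannot distinguish the museum's statements from anyone else's.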
In our domain, this is why the Europeana Data Model[2] adopts e.g. the
ORE Proxy [3] concept to specifically label a set of assertions as
coming from a known resource.
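
The general idea of labelling assertions with their source can be sketched in
a few lines of Python. This is not the EDM or ORE data model itself (there, as
I understand it, statements are attached to a proxy resource rather than
directly to the object); it is only an illustration, with invented URIs and
helper names, of keeping the source next to each statement so that a consumer
can filter on it.

```python
from collections import defaultdict

# Hypothetical URIs, invented for illustration only.
OBJECT = "http://museum.example.org/object/1234"
MUSEUM_PAGE = "http://museum.example.org/page/1234"
OTHER_PAGE = "http://elsewhere.example.com/page/1234"

# Each statement is stored as (source, subject, predicate, object).
statements = [
    (MUSEUM_PAGE, OBJECT, "dc:title", "Brass sextant"),
    (OTHER_PAGE, OBJECT, "dc:title", "Cheap replica"),
]


def assertions_by_source(statements, subject):
    """Group the statements about `subject` by the resource that made them."""
    grouped = defaultdict(list)
    for source, s, p, o in statements:
        if s == subject:
            grouped[source].append((p, o))
    return dict(grouped)


def trusted_view(statements, subject, trusted_sources):
    """Keep only the statements about `subject` made by a trusted source."""
    return [
        (p, o)
        for source, s, p, o in statements
        if s == subject and source in trusted_sources
    ]


by_source = assertions_by_source(statements, OBJECT)
print(trusted_view(statements, OBJECT, {MUSEUM_PAGE}))
# [('dc:title', 'Brass sextant')]
```

Deciding which sources to trust is still left to the consumer; the sketch only
shows that the question can be asked once the source is recorded.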
Richard
[1] http://museum-api.pbworks.com/Sample%20NMSI%20objects%20as%20Linked%20Data
    [be kind, LD gurus!]
[2] http://version1.europeana.eu/c/document_library/get_file?uuid=9783319c-9049-436c-bdf9-25f72e85e34c&groupId=10602
[3] http://www.openarchives.org/ore/terms/Proxy
--
Richard Light