Re: spam

Richard Light Sat, 17 Jul 2010 04:05:05 -0700

In message <[email protected]>, Nathan <[email protected]> writes

Thus, in addition to nudging at general awareness of these issues, I do wonder who (if anybody) is working on spam (and unethical usage) solutions for the web of data?

A brave choice of subject line. If it hadn't been from you (and how was I to know it really was?) I would have deleted it unread in my e-mail-previewing program (AKA my post-spam-filter-filter).

I think the answer is that we are wide open to this, especially with data at the "simple triples" level. I was thinking just yesterday about how museums might publish their collections objects as Linked Data.[1]

If they were to follow the dbpedia model, and publish a set of [unrelated] triples with the object identifier as subject, embedded in their web page for the object, there is nothing to stop someone else putting out a page containing lies about that object, also expressed as simple triples with the object URL as subject. By the time Google has indexed both those pages "semantically" (see yesterday's acquisition of FreeBase) and merged the results in its uber-index, you won't know the difference. Not a likely scenario for museum objects, I guess, but very much a possibility for commercially-sensitive and personal information.

In our domain, this is why the Europeana Data Model[2] adopts e.g. the ORE Proxy [3] concept to specifically label a set of assertions as coming from a known resource.
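
To make the difference concrete, here is a minimal sketch in Python. It is not the ORE Proxy mechanism itself, only a simplified illustration of the underlying idea: keep the source alongside each assertion instead of merging bare triples. All URIs, values and function names below are hypothetical.

```python
# Hypothetical URIs and values, for illustration only.
OBJECT = "http://museum.example/object/1234"
MUSEUM_PAGE = "http://museum.example/object/1234.html"
OTHER_PAGE = "http://elsewhere.example/about-1234.html"

# Each harvested statement: (subject, predicate, object, source page).
harvested = [
    (OBJECT, "dc:title", "Model steam locomotive", MUSEUM_PAGE),
    (OBJECT, "dc:date", "1829", MUSEUM_PAGE),
    (OBJECT, "dc:title", "Cheap replica, for sale", OTHER_PAGE),
]


def merge_as_triples(statements):
    """Drop the source, as a naive index of bare triples would."""
    return {(s, p, o) for s, p, o, _source in statements}


def statements_from(statements, trusted_source):
    """Keep only the triples asserted by one known source."""
    return {(s, p, o) for s, p, o, source in statements
            if source == trusted_source}


def titles(triples):
    """Return all dc:title values found in a set of triples, sorted."""
    return sorted(o for _s, p, o in triples if p == "dc:title")


# After a naive merge both titles look equally authoritative.
print(titles(merge_as_triples(harvested)))
# With the source retained, the museum's own statement can be singled out.
print(titles(statements_from(harvested, MUSEUM_PAGE)))
```

Once the source is dropped, nothing in the merged set distinguishes the museum's title from the other page's; with the source kept, a consumer can at least choose whose assertions to believe.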


Richard

[1] http://museum-api.pbworks.com/Sample%20NMSI%20objects%20as%20Linked%20Data [be kind, LD gurus!]

[2] http://version1.europeana.eu/c/document_library/get_file?uuid=9783319c-9049-436c-bdf9-25f72e85e34c&groupId=10602
[3] http://www.openarchives.org/ore/terms/Proxy
--
Richard Light
