Originally sent as a private reply, though I had intended it for the list.

---------- Forwarded message ----------
Date: Sun, Nov 23, 2008 at 12:03 PM
Subject: Re: [uf-discuss] hCard slowing adoption of microformats?

Dan,

I do know of several instances (since corrected, so I won't name names) of sites with thousands to millions of users publishing birthdays, emails, and email hashes (which can be used to perform unintended identity consolidation) in their FOAF files (while not on visible profile pages).

The problem is that web pages are typically designed by web designers who take a very strong user-centric (privacy, expectations, etc.) perspective, whereas abstract format files are written by programmers, and to them such files look like a form to be filled out from a database query, so they happily do so, empirically often without considering user perspectives.

Thus there is another tendency for such invisible data (and invisible data formats) to induce leakage of private data from databases, simply by how their design itself influences the population that supports/publishes/programs them.

Republishing is a challenge for all data on the web, but users understand copy & paste of visible text on the web. They're surprised when private details become public.

There is also a "quantity surprise" effect when people see thousands of pieces of text being copied/pasted/indexed, and currently the Google SG API is providing an interesting test of that expectation wrt XFN. So far the anecdotal surprises about SGAPI have been far more "wow cool" than "yikes creepy".

We'll see what happens when we see web-wide hCard fielded search (more than just raw search as Y! Searchmonkey supports).

Tantek

-----Original Message-----
From: Dan Brickley <[EMAIL PROTECTED]>
Date: Sun, 23 Nov 2008 20:47:31
To: <[EMAIL PROTECTED]>; Microformats Discuss<[email protected]>
Subject: Re: [uf-discuss] hCard slowing adoption of microformats?

Hi Tantek,

Tantek Celik wrote:
> This is also a classic visible data (eg on HTML pages) vs invisible data (eg
> at URLs not linked to or at least not easily viewable in browsers in
> random/rare(r) XML formats) problem.
>
> The more visible the data, the less likely users will be surprised by having
> data they may have thought was private (because they didn't see it on the
> web) be scraped, aggregated, indexed, republished.
>
> When data *is* visible that users don't feel comfortable publishing, they
> take steps to remove or make it private.
>
> Hence we discourage publishing of invisible data. It's user unfriendly, and
> leads to far more frequent violations of user expectations.

I generally agree. We discourage people from exposing anything in FOAF that isn't otherwise available in textual form in public HTML.

While it seems (I never got the details confirmed before it was switched off) that Tribe may have exposed more in the RDF/XML than in the HTML, from reading through the many user comments it was the wholesale-ness of the thing that really upset people. It looked like their entire profile *and* those of their buddies had been copied/cloned. This could equally well have been accomplished through use of curl/wget and some scraping tools, and most users wouldn't have been any the wiser, or any the happier.

You can make your own mind up here:
http://brainstorm.tribe.net/thread/34fb1a79-351d-4251-8318-829623c1c9cb

The initial post is pretty indicative of the tone:

"Can someone please tell me why my bio and all of my tribe friends are listed on a site I have never been to or heard of? I didn't think this was Tribes style. I feel cheated and betrayed. If I wanted my profile to be farmed out, I would join Facebook."

Short of keeping all public profile data buried inside hard-to-parse GIFs, any markup describing profiles and linking to buddies is at risk of being 'exploited' in just this way. I think the main reason we haven't seen many complaints (about FOAF or hCard+XFN) is not the visible/invisible issue, but simply that there aren't many sites that have taken a "download the entire set of people descriptions and re-assemble them on another site" approach. Thankfully.

cheers,

Dan

--
http://danbri.org/
_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
