On Thu, Jun 11, 2009 at 09:27:32PM +0100, Chris Dent wrote:
> To anybody about 15 years ago, after the web was born, but before things
> got really grooving, much of what Flickr does would have been utterly
> jaw-dropping.
>
> So it makes perfect sense to me that in _less_ than 15 years your
> concerns about speed, content validation, and UI integration will be
> moot.

This seems to be a variant of Moore's law you're arguing.

However, this misses the point. What you count as a feature is fundamentally the
reason no commercial site would implement this. Why would a commercial site like
Flickr put the quality of its service in the hands of its users?

There are just so many problems with this that I can't see it ever working. When
your revenue stream depends on it, you had better make sure that you can
guarantee a certain level of service for your users.

> It also makes some sense that content replacement scenarios wherein I replace
> a picture of my cute kitten with something with a similar name but less rated
> G will be in a different social context. First off, in an environment where
> there is concern about such things happening, the presentation service can
> maintain checksums of data and other comparators so they can tell when
> something is or is not what it claims to be.

Sure, but what kind of burden would this put on Flickr?

Not only would they have to deal with a cache that stores, and regularly expires
so as to pick up changes, two billion photographs; they would also have to
moderate the expiration of that cache for any malicious changes.

What should they do if one of the images changes to something unacceptable?

Should they remove the image from the site? Should they add a notice saying that
it has changed but that the cache refuses to update itself? Or should they
transparently keep the stale cache entry?
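To make the burden concrete: even the simplest checksum scheme means re-fetching every original and comparing it against a digest recorded at publish time, for every image, on every refresh. A rough sketch in Python (the names and the choice of SHA-256 are mine, not anything Flickr actually does):

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an image's bytes."""
    return hashlib.sha256(data).hexdigest()

def content_changed(fetched: bytes, recorded: str) -> bool:
    """True if the re-fetched bytes no longer match the published digest."""
    return digest(fetched) != recorded

# At publish time, the presentation service records the digest...
published = digest(b"cute kitten photo bytes")

# ...and on every cache refresh it must re-fetch and compare.
assert not content_changed(b"cute kitten photo bytes", published)
assert content_changed(b"something rather less rated G", published)
```

And note that a changed digest only tells you *something* changed; deciding whether the change is malicious still needs moderation on top.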

> Secondly, a main thrust of this "stuff" is that people will interact with
> content differently so their response to seeing genitalia where they had been
> seeing a cute kitten is "Chris is causing me a problem" not "Flickr is causing
> me a problem".

People can be very reactionary.

Certainly not something I would be happy with as a Flickr stakeholder.

> I realize that in our current business, litigation and social environments
> that's a high hurdle to hop, but consider what autonomy really means. If I am
> autonomous, then I take the praise and the blame, and experience the
> consequences. So a change in how we manage content also requires a change in
> how we manage ourselves, as individuals and en masse.

It's really not as simple as that.

To stick with our current example, Flickr are the community gatekeepers. They
are ultimately responsible for anything they allow to appear on the site,
regardless of the technical implementation behind it. In the scenario you
describe, I'm not so sure that autonomy would be a very successful defence when
people started seeing child porn in the search results.

Apologies for using such an emotive example.

> I think you might be a little confused about how URLs work. URLs are what
> makes this whole "bank" concept roll. If I, a content owner, am concerned
> about latency, I can put my content in a low latency bank. It doesn't have to
> be on my server. If I want everything to look like it is in my server, I can
> use redirects. Or I can hire a bank that acts like a giant caching proxy in
> front of my lame little server (sounds suspiciously like a CDN, no?).

This is fine, but doesn't address my main concern.

> URLs are also one way in which somewhere like Flickr can access
> originals (sort of like looking on a filesystem, but with a strange
> inode) to create the various versions that live under sizes/. I'm
> speculating, but I would guess that originals aren't viewed all that
> much, it's the flickr created versions that are _presented_ most.

So, for things to work properly:

  * Flickr would need to develop a custom cache

  * The cache must handle over two billion images

  * The cache must be spread over a CDN

  * The cache must refresh all of those images every few days or so

    * This is VERY different to how most caches invalidate resources

  * The cache must check for malicious updates

  * For every image, the cache must generate a set of resized images

That would be a significant amount of ongoing effort to maintain.

How much would it cost in bandwidth to refresh two billion images every few days?
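A rough back-of-envelope, assuming an average of 2 MiB per original and a refresh every three days (both figures are my guesses, not Flickr's numbers):

```python
images = 2_000_000_000
avg_bytes = 2 * 1024 * 1024      # assumed 2 MiB average per original
refresh_days = 3                 # assumed refresh interval

total_bytes = images * avg_bytes             # bytes per full refresh
per_day_tib = total_bytes / refresh_days / 1024**4

print(round(total_bytes / 1024**5, 1), "PiB per refresh")   # 3.7 PiB per refresh
print(round(per_day_tib), "TiB per day")                    # 1272 TiB per day
```

Over a petabyte per day of transfer, before you even count the cost of regenerating the resized versions.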

What financial incentive does Flickr have to make this effort worthwhile?

What benefit are the users afforded over the status quo?

Are those benefits worth having to pay for a third-party hosting service?

> You're right, validation would need to be done, but this is actually one
> way in which HTTP is really good and totally under-utilized. And if we
> really want to get in the nitty gritty here (which seems a bit
> premature) it is quite likely that whenever I "publish" something to a
> service, it is via a notification, likely over something like XMPP but I
> don't see why it couldn't be over HTTP. When I change that thing I will
> send an authentic "force a cache invalidate on URL X" message to the
> various presentation services to which I have published.

You just increased the complexity of the system tenfold.
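Even the "authentic" part of that message alone drags in key distribution, replay protection, retries, and receiver-side verification. A minimal sketch of just the authentication step, assuming an HMAC over a pre-shared secret between the content owner and the presentation service (all names and the scheme itself are hypothetical):

```python
import hashlib
import hmac
import time

SECRET = b"shared-secret"  # hypothetical pre-shared key, per publisher

def sign_invalidation(url: str, secret: bytes = SECRET) -> dict:
    """Build a signed 'force a cache invalidate on URL X' message."""
    ts = str(int(time.time()))
    payload = f"{url}|{ts}".encode()
    sig = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {"url": url, "ts": ts, "sig": sig}

def verify_invalidation(msg: dict, secret: bytes = SECRET) -> bool:
    """Presentation service checks the signature before purging its cache."""
    payload = f"{msg['url']}|{msg['ts']}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["sig"])

msg = sign_invalidation("http://example.org/photos/kitten.jpg")
assert verify_invalidation(msg)
```

And that still says nothing about what happens when the message is lost, delayed, or sent by a compromised publisher, which is exactly the malicious-update problem again.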

> Or to put it another way: In some sense you're saying this won't happen
> because presentation services won't be able to do their job the way
> they've always done. I'm saying: Yeah, and? We don't have a ton of
> longitudinal evidence that supports that the way they are doing things
> now is correct.

My main argument is that commercial sites don't want to put their quality of
service in the hands of their users, and that the technical measures needed to
mitigate that concern are probably too ambitious to be worthwhile.

Best,

-- 
Noah Slater, http://tumbolia.org/nslater
_______________________________________________
Discuss mailing list
[email protected]
http://lists.autonomo.us/mailman/listinfo/discuss
