Hi, Zack.

Thanks for the pointers.  I'm copying the list on this so we can all
have the same context in this discussion.


1. THE END-TO-END ARGUMENT
--------------------------

On Thu, 31 Jul 2003, zachary rosen wrote:
> http://www.wikipedia.org/wiki/End-to-end_argument
[...]
> A lot more can be found on Edge to Edge (also referred to as end to end)
> from google.  It is something that has been talked about for quite some
> time.

Ah -- now i see that you are talking about what is familiar to me as
the "end-to-end argument".  I do know the Saltzer, Reed, and Clark
paper [1].  I wondered if this is what you meant by "edge-to-edge",
but you seemed to be describing something so different from the
end-to-end argument that i assumed you must have meant something else.

I think you have misunderstood what Saltzer et al. were trying to say.
Let me try to explain.  The "end-to-end argument" is a design principle
that has to do with deciding whether system functionality should be
placed at high levels or low levels.  The paper argues that functions
placed at low levels may be redundant or too costly to be worth it,
because the same functions often have to get reimplemented by the higher
levels anyway -- because the higher levels (e.g. the application) know
their own needs better.

Putting functionality at a lower level amounts to making an assumption
that every application will want that functionality.  But if your
assumption is wrong, the lower levels might waste a lot of resources
trying to provide a service that the application doesn't even need.
So you should rely on intelligence at the highest level (in the case
of a network, the endpoints of communication) instead of getting too
obsessed with the lower levels.

For example, it might seem reasonable to assume that a network should
always deliver error-free packets.  So adding a checksum-and-retry
feature to a network layer in order to guarantee accurate delivery may
seem like a good idea.  But there are some applications that care more
about speed than accuracy -- such as voice over IP -- and these would
be harmed by the inefficiency of a checksum-and-retry feature.
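
Here is a minimal Python sketch of that idea, not anything from the
Saltzer paper itself.  All of the names (make_flaky_channel,
fetch_reliably, fetch_fast) are made up for illustration, and the
"network" is just a function that passes bytes along and corrupts the
first frame.  The checksum-and-retry logic lives at the endpoint, so
only the application that wants accuracy pays for it:

```python
import zlib

def make_flaky_channel():
    """Hypothetical low-level channel: corrupts only the first frame it carries."""
    calls = {"n": 0}
    def channel(frame: bytes) -> bytes:
        calls["n"] += 1
        if calls["n"] == 1:
            # Flip every bit of the last byte to simulate a transmission error.
            return frame[:-1] + bytes([frame[-1] ^ 0xFF])
        return frame
    return channel

def add_checksum(payload: bytes) -> bytes:
    """Sending endpoint prepends a CRC32 of the payload."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def verify(frame: bytes):
    """Receiving endpoint returns the payload if the CRC32 matches, else None."""
    expected = int.from_bytes(frame[:4], "big")
    payload = frame[4:]
    return payload if zlib.crc32(payload) == expected else None

def fetch_reliably(channel, payload: bytes, max_tries: int = 3) -> bytes:
    """Application that needs accuracy: verify and retry at the endpoint."""
    for _ in range(max_tries):
        result = verify(channel(add_checksum(payload)))
        if result is not None:
            return result
    raise IOError("gave up after %d tries" % max_tries)

def fetch_fast(channel, payload: bytes) -> bytes:
    """Application that prefers speed: accept whatever arrives, no retry."""
    return channel(add_checksum(payload))[4:]
```

With a fresh flaky channel, fetch_reliably(make_flaky_channel(),
b"hello") gets the payload intact on the second try, while fetch_fast
returns the damaged bytes immediately.  Neither choice required any
change to the channel.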

Now let's return to our question about whether the media database
should be centralized.  Regardless of whether it is centralized or
distributed, we are still obeying the end-to-end argument: we are not
putting any smarts in the transport layer (TCP/IP); we are totally
relying on smarts at the endpoints of communication (that is, the Web
browser and the Web server).  No one is putting in functions at a low
level that are getting reimplemented at a higher level.

So the end-to-end argument has no bearing on our decision at all.
In particular, it is purely an efficiency argument, and it doesn't
say anything about peer-to-peer networks.  (Be warned, by the way,
that lots of companies use the terms "end-to-end" and "peer-to-peer"
because they are fashionable, not because they know what they mean.)


2. REED'S LAW
-------------

> http://www.wikipedia.org/wiki/Reed%27s_law

Originally, my response was going to be that Reed's Law has no effect
on our decision either.  Reed's Law says that the utility of a network
is exponentially related to the number of participants.  But it doesn't
matter whether you have 5 users at site A and 5 users at site B, or
just 10 users at site Z -- you still have 10 users, and utility on the
order of 2^10.  The utility is the same regardless of whether the
database is centralized or distributed.

But then i went back and read the original paper [2] and thought about
it a little more.  Now i've realized that Reed's Law actually argues
in *favour* of a centralized database.

Notice that the paper doesn't say "all networks have utility that
scales exponentially in the number of participants".  It refers to a
specific *type* of network -- a "Group-Forming Network".  In Reed's words:

    A GFN has functionality that directly enables and supports
    affiliations (such as interest groups, clubs, meetings,
    communities) among subsets of its customers.  Group tools and
    technologies (also called community tools) such as user-defined
    mailing lists, chat rooms, discussion groups, buddy lists, team
    rooms, trading rooms, user groups, market makers, and auction
    hosts, all have a common theme -- they allow small or large
    groups of network users to coalesce and to organize their
    communications around a common interest, issue, or goal.

The reason that the utility scales exponentially is that, if N people
are allowed to form and coordinate their own groups of any size, then
there are roughly 2^N possible groups that can be formed.  The whole point of
Reed's paper is to argue that this group-forming capability is
essential and extremely powerful.  As an example, he compares ordinary
e-mail to mailing lists.  Ordinary point-to-point e-mail connects only
two people, so its utility scales by N^2 (Metcalfe's Law).  But a
mailing list can coordinate any number of members, so its utility
scales by 2^N (Reed's Law).
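
To make the difference between the two growth rates concrete, here is
a small Python sketch.  The function names are invented for this
example, and it only counts possible pairs and possible groups; it
does not measure actual utility.  Groups of fewer than two people are
excluded, which is why the group count is 2^N - N - 1 rather than
exactly 2^N:

```python
from math import comb

def possible_pairs(n: int) -> int:
    """Distinct two-person connections among n participants (grows like N^2)."""
    return comb(n, 2)

def possible_groups(n: int) -> int:
    """Distinct groups of two or more among n participants (grows like 2^N).

    All 2^n subsets, minus the empty set and the n single-person subsets.
    """
    return 2 ** n - n - 1

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, possible_pairs(n), possible_groups(n))
```

For 10 participants this gives 45 possible pairs but 1013 possible
groups, and the gap widens quickly as N grows.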

How does this bear on our media database?  It tells us that enabling
people to self-associate into groups (perhaps around individual media
items, media projects, or collections of media) is crucial.  And if
you look at the media items themselves as participants, it's clear
that the ability to gather and share collections of media is also
crucial, because it can yield the same exponential effect.

Giving people and media items a fixed address at one location vastly
simplifies the problem of forming these groups and collections.  It's
much harder to find other users and media items scattered across many
different sites than at one central site.  (This is why we are building
VV!)  And it's much harder to coordinate and update a collection
containing items scattered across many sites than at one central site.

Reed's Law argues that our media network must be a Group-Forming Network.
To form these groups, we have to link people to media items, people to
people, and media items to each other.  As i explained in our IRC
discussion, this is easy to do if the database is centralized and very
complicated otherwise.

We have a candidate to get into office, an opposing $200 million
campaign about to wage an all-out war on us, and no time to waste.
I favour the simpler solution.


-- ?!ng


[1] Saltzer, Reed, and Clark.  End-to-end Arguments in System Design.
    http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.txt

[2] Reed.  That Sneaky Exponential -- Beyond Metcalfe's Law to the Power
    of Community Building.  http://www.reed.com/Papers/GFN/reedslaw.html
