Hi, Zack. Thanks for the pointers. I'm copying the list on this so we can all have the same context in this discussion.

1. THE END-TO-END ARGUMENT
--------------------------

On Thu, 31 Jul 2003, zachary rosen wrote:
> http://www.wikipedia.org/wiki/End-to-end_argument
[...]
> A lot more can be found on Edge to Edge (also referred to as end to end)
> from google. It is something that has been talked about for quite some
> time.

Ah -- now i see that you are talking about what is familiar to me as the "end-to-end argument". I do know the Saltzer, Reed, and Clark paper [1]. I wondered if this is what you meant by "edge-to-edge", but you seemed to be describing something so different from the end-to-end argument that i assumed you must have meant something else.

I think you have misunderstood what Saltzer et al. were trying to say. Let me try to explain.

The "end-to-end argument" is a design principle that has to do with deciding whether system functionality should be placed at high levels or low levels. The paper argues that functions placed at low levels may be redundant or too costly to be worth it, because the same functions often have to get reimplemented by the higher levels anyway -- because the higher levels (e.g. the application) know their own needs better.

Putting functionality at a lower level amounts to making an assumption that every application will want that functionality. But if your assumption is wrong, the lower levels might waste a lot of resources trying to provide a service that the application doesn't even need. So you should rely on intelligence at the highest level (in the case of a network, the endpoints of communication) instead of getting too obsessed with the lower levels.

For example, it might seem reasonable to assume that a network should always deliver error-free packets. So adding a checksum-and-retry feature to a network layer in order to guarantee accurate delivery may seem like a good idea.
But there are some applications that care more about speed than accuracy -- such as voice over IP -- and these would be harmed by the inefficiency of a checksum-and-retry feature.

Now let's return to our question about whether the media database should be centralized. Regardless of whether it is centralized or distributed, we are still obeying the end-to-end argument: we are not putting any smarts in the transport layer (TCP/IP); we are totally relying on smarts at the endpoints of communication (that is, the Web browser and the Web server). No one is putting in functions at a low level that are getting reimplemented at a higher level.

So the end-to-end argument has no bearing on our decision at all. In particular, it is purely an efficiency argument, and it doesn't say anything about peer-to-peer networks.

(Be warned, by the way, that lots of companies use the terms "end-to-end" and "peer-to-peer" because they are fashionable, not because they know what they mean.)

2. REED'S LAW
-------------

> http://www.wikipedia.org/wiki/Reed%27s_law

Originally, my response was going to be that Reed's Law has no effect on our decision either. Reed's Law says that the utility of a network is exponentially related to the number of participants. But it doesn't matter whether you have 5 users at site A and 5 users at site B, or just 10 users at site Z -- you still have 10 users, and utility on the order of 2^10. The utility is the same regardless of whether the database is centralized or distributed.

But then i went back and read the original paper [2] and thought about it a little more. Now i've realized that Reed's Law actually argues in *favour* of a centralized database.

Notice that the paper doesn't say "all networks have utility that scales exponentially in the number of participants". It refers to a specific *type* of network -- a "Group-Forming Network".
In his words:

    A GFN has functionality that directly enables and supports affiliations (such as interest groups, clubs, meetings, communities) among subsets of its customers. Group tools and technologies (also called community tools) such as user-defined mailing lists, chat rooms, discussion groups, buddy lists, team rooms, trading rooms, user groups, market makers, and auction hosts, all have a common theme -- they allow small or large groups of network users to coalesce and to organize their communications around a common interest, issue, or goal.

The reason that the utility scales exponentially is that, if N people are allowed to form and coordinate their own groups of any size, then there are 2^N possible groups that can be formed. The whole point of Reed's paper is to argue that this group-forming capability is essential and extremely powerful.

As an example, he compares ordinary e-mail to mailing lists. Ordinary point-to-point e-mail connects only two people, so its utility scales by N^2 (Metcalfe's Law). But a mailing list can coordinate any number of members, so its utility scales by 2^N (Reed's Law).

How does this bear on our media database? It tells us that enabling people to self-associate into groups (perhaps around individual media items, media projects, or collections of media) is crucial. And if you look at the media items themselves as participants, it's clear that the ability to gather and share collections of media is also crucial, because it can yield the same exponential effect.

Giving people and media items a fixed address at one location vastly simplifies the problem of forming these groups and collections. It's much harder to find other users and media items scattered across many different sites than at one central site. (This is why we are building VV!) And it's much harder to coordinate and update a collection containing items scattered across many sites than at one central site.
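
To put rough numbers on the N^2 versus 2^N comparison above, here is a minimal sketch. The function names are hypothetical and not taken from Reed's paper. It counts exact pairs, N(N-1)/2, which is on the order of N^2, and exact groups of two or more members, 2^N - N - 1, which is on the order of 2^N.

```python
from math import comb


def pair_count(n: int) -> int:
    """Distinct two-person links among n participants (grows like N^2)."""
    return comb(n, 2)


def group_count(n: int) -> int:
    """Possible groups of two or more among n participants (grows like 2^N).

    All 2^n subsets, minus the n single-member subsets and the empty set.
    """
    return 2**n - n - 1


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(f"N={n:2d}  pairs={pair_count(n):4d}  groups={group_count(n)}")
```

With N = 10 this gives 45 possible pairs against 1013 possible groups, and the gap widens quickly as N grows.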

Reed's Law argues that our media network must be a Group-Forming Network. To form these groups, we have to link media items and people together and to each other. As i explained in our IRC discussion, this is easy to do if the database is centralized and very complicated otherwise.

We have a candidate to get into office, an opposing $200 million campaign about to wage an all-out war on us, and no time to waste. I favour the simpler solution.

-- ?!ng

[1] Saltzer, Reed, and Clark. End-to-end Arguments in System Design.
    http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.txt

[2] Reed. That Sneaky Exponential -- Beyond Metcalfe's Law to the Power of Community Building.
    http://www.reed.com/Papers/GFN/reedslaw.html