On Mon, May 07, 2007 at 05:51:45AM +0000, Lalo Martins wrote:
> Aaron Bentley posted to the bzr list about a Van Jacobson talk:
> > I was watching this talk by Van Jacobson about a new networking
> > paradigm, and I started going "hey, I know this stuff".
> >
> > http://video.google.com/videoplay?docid=-6972678839686672840&hl=en
> >
> > Around 37:31, he starts talking about a new dissemination mechanism in
> > which you look for named data, rather than having conversations with
> > servers.
>
> I can't actually *watch* the talk, though, as stupid google video doesn't
> work in China. If anyone is interested, can you please watch, and post a
> summary? In particular, how much it's relevant to the way we're already
> doing things ("named data" sounds a lot like "vobject" from my chair).
Summary:

The first 40 or so minutes explain why networks up until now evolved the way they did. Circuit-oriented telephone networks evolved the way they did because of specific properties of the underlying circuit-switching technology (going back to human operators working a switchboard!). Packet-switched networks were revolutionary because, unlike the phone system, they were agnostic to the underlying transport medium. TCP/IP was designed for point-to-point communication, based on the assumption that the primary use of data networks would still be point-to-point "conversations". TCP/IP was also designed in an environment where each computer had many users, by contrast with today, where you have many computers per user.

The second part of the talk describes where we are today, and how networks can be adapted to make things better. Modern Internet usage has evolved such that the vast majority of traffic is better described as broadcast traffic rather than point-to-point: publishing web pages, streaming video, file sharing, even email in the case of mailing lists. This is very inefficient if many users are requesting the same data at once. Another problem posed by current architectures is the challenge of synchronizing data between devices, which can also be traced to the fact that devices are often required to synchronize on a peer-to-peer basis, rather than having a mechanism to broadcast changes to other devices.

The proposed solution is a bit light on details but big on ideas: to deal with problems of scale in the age of Internet publishing, we step away from our notion of purely fixed-address, point-to-point communication, and recognize that in many cases it is highly desirable for data to be replicated and propagated automatically.
In the example given, when you access the New York Times (newspaper) front page, you shouldn't care whether the actual data you get is served from the NYT web server or from some other downstream server that has a copy -- provided you can verify that it originated from the NYT by checking the digital signature.

One significant idea mentioned was that, just as TCP/IP abstracts the underlying physical transport layer, such a system ought to be abstracted from the protocol layer, so that data can be propagated by whatever physical or virtual means are most appropriate or available.

He points to Gnutella and BitTorrent as examples of trends in this direction. Each demonstrates the two key properties of this type of approach: once something is published and replicated a few times, it may stay in the network even if the original source is no longer available; and popular resources are inherently load balanced, because the more people access a resource, the more intermediate servers will have a copy. Unfortunately, he didn't seem to mention Freenet (http://freenetproject.org), which to my knowledge is the most complete implementation of many of the ideas he's promoting.

Commentary:

This talk is primarily aimed at spurring people to do more research in this area. For this reason, it poses many questions but provides few concrete answers as to how such a system would be put together in practice. He helpfully separates things into the "easy stuff" (problems for which reasonable solutions already exist) and the "hard stuff" (everything else).

He doesn't really touch on the highly dynamic nature of current web sites. When every user is served a custom page, complete with widgets and ads personalized to their zip code, it's much more difficult to replicate in a useful way.
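The NYT example above can be sketched in a few lines. This is a minimal illustration in Python, not anything from the talk: it checks a content digest with `hashlib`, and elides the actual public-key signature over that digest, which a real system would verify with a crypto library.

```python
import hashlib

def verify_named_data(content: bytes, publisher_digest: str) -> bool:
    # Accept the bytes no matter which mirror served them, as long as
    # they match the digest the publisher vouched for.  (A real system
    # would also verify a public-key signature over this digest; that
    # step is elided here.)
    return hashlib.sha256(content).hexdigest() == publisher_digest

# Publisher side: compute (and sign) the digest once.
front_page = b"<html>...front page...</html>"
digest = hashlib.sha256(front_page).hexdigest()

# Consumer side: the data may arrive from the origin or any downstream copy.
print(verify_named_data(front_page, digest))        # True
print(verify_named_data(b"tampered copy", digest))  # False
```

The point is that trust attaches to the named data itself, not to the server that happened to deliver it.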
Of course, media (sound, images, video, maybe 3D meshes later on) are usually not (yet) dynamically generated and account for quite a lot of bandwidth, so there are still gains to be made there. Resources like HTML pages could also be divided up into finer-grained representations that distinguish static and dynamic elements. He does mention that timestamps and versioning would need to be an inherent part of this system so that published resources can be updated.

It's worth noting that a key difference from caching seems to be that this would inherently be a "push" system -- when you publish something, you go and bang on the doors of nearby hosts and ask them to pretty please replicate your data and pass it on to any other hosts that might be interested. This is interesting, because this kind of "push/flood" system ends up resembling store-and-forward message routing, as new data is directed through several hops to eventually reach every host that has expressed an interest in it.

How this might influence VOS:

Replication, migration and versioning are essential for the long-term scalability of a distributed system like VOS, and VOS is in many ways a great example of the kind of "data dissemination" system he talks about. Something I've also come to realize is that some notion of time in the system is critical, and that "time" and "versioning" are fairly closely related concepts when describing a series of changes to a particular resource. So it is useful to think about how the s5 design will accommodate object replication and migration, and their relationship to time and versioning.

Something else we need to consider is that vobjects are both declarative (well-defined data fields, not opaque) and computational objects. Replicating data is relatively straightforward, but what about computation?
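The "push/flood" store-and-forward behavior described a couple of paragraphs up can be sketched concretely. This is a toy model of my own, not anything from the talk: each host keeps a replica and re-pushes to its neighbors, and the version number is what stops the flood (and handles cycles), which is exactly why versioning has to be inherent to the system.

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    neighbors: list = field(default_factory=list)
    store: dict = field(default_factory=dict)  # topic -> (version, data)

def publish(host: Host, topic: str, version: int, data: bytes) -> None:
    # Store-and-forward flood: keep a replica, then knock on each
    # neighbor's door.  The version check stops the flood once a host
    # already holds this version or newer.
    held = host.store.get(topic)
    if held is not None and held[0] >= version:
        return
    host.store[topic] = (version, data)
    for n in host.neighbors:
        publish(n, topic, version, data)

# A three-hop chain: publishing at A reaches C via B.
a, b, c = Host("A"), Host("B"), Host("C")
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
publish(a, "nyt/front-page", 1, b"edition 1")
publish(a, "nyt/front-page", 2, b"edition 2")  # newer version supersedes
print(c.store["nyt/front-page"])               # (2, b'edition 2')
```

Note that C ends up with a replica it never asked the origin for -- the popular-data-gets-replicated property from the talk falls out of the mechanism for free.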
I can think of at least three cases when making a call on a replicated object:

- No replicated computation: no chance for local processing; always send a message to the master vobject. Example: talk messages.
- Predictive computation: send a message and try to guess the result, but there's a chance we'll be overruled. Example: movement interpolation, physics.
- Deterministic computation: the behavior will have an effectively identical outcome whether run in the local replica or in the master vobject. Example: a mouse rollover graphic effect.

To really support replication in the presence of versioning, the vobject "descriptor" needs to incorporate time and versioning. A fully qualified vobject identifier suitable for replication and caching (including routing and security bits) might include:

* site id
* vobject id
* embedded child id
* last modification time
* version number
* capability key
* hash code

So, Lalo, this is probably a bit more than you expected :-) I think the answer to your question ("could VOS be useful for the things Van Jacobson talks about") is yes, if we incorporate a robust notion of time and version as related to state changes. If anyone thinks this is fanciful: it actually cuts right to the core of how remote vobjects work, and how we eventually handle caching -- central issues to the s5 redesign.

--
[ Peter Amstutz ][ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[Lead Programmer][Interreality Project][Virtual Reality for the Internet]
[ VOS: Next Generation Internet Communication][ http://interreality.org ]
[ http://interreality.org/~tetron ][ pgpkey: pgpkeys.mit.edu 18C21DF7 ]
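As a postscript, the three computation cases and the descriptor field list above might look something like this in code. All the names and field types here are my own illustrative guesses, not s5 design decisions:

```python
from dataclasses import dataclass
from enum import Enum, auto

class ComputationMode(Enum):
    MASTER_ONLY = auto()    # always forward the call to the master vobject
    PREDICTIVE = auto()     # guess locally; the master may overrule us
    DETERMINISTIC = auto()  # identical outcome on replica or master

@dataclass(frozen=True)
class VobjectDescriptor:
    # Fully qualified identifier: any state change bumps the version,
    # so two hosts can compare descriptors to detect a stale replica
    # without fetching the data itself.
    site_id: str
    vobject_id: str
    child_id: str          # embedded child id
    last_modified: float   # last modification time (seconds since epoch)
    version: int
    capability_key: bytes  # security bits
    content_hash: str      # hash code for integrity checks

v1 = VobjectDescriptor("siteA", "avatar7", "head", 1178500000.0, 3, b"cap", "ab12")
v2 = VobjectDescriptor("siteA", "avatar7", "head", 1178500060.0, 4, b"cap", "cd34")
print(v1 == v2)  # False: the cached replica is stale
```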
_______________________________________________
vos-d mailing list
vos-d@interreality.org
http://www.interreality.org/cgi-bin/mailman/listinfo/vos-d