Versioning yes, but also vetting and revetting of sources. The further you get from original sources in any communication system, the more noise you incur without adequate checks. Shannon 101. Names alone won't do it.
I put a trivia test on my personal blog as a "Do you trust Google and Wikipedia?" test. The problem is one of not starting from an authenticated or original source. If you start from Wikipedia to answer those questions without the original source, you will get about half of them wrong or nearly wrong. Modern Internet traffic worries about efficiency, but the data is typically short-lived.

If you live where I live, you get to watch a fascinating change: NASA is hiring as many sixty- and even seventy-plus-year-old engineers as it can find if they have actual J-2 series engine experience. The original sources and digital systems failed to keep enough documents alive. They have the designs, but like the Canadians who tried to rebuild the V-2 engines for their contest submission, they don't know how to run them, and it turns out the devil really is in the details.

len

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Amstutz

Summary:

The first 40 or so minutes explain why networks up until now evolved the way they did. Circuit-oriented telephone networks evolved the way they did because of the specific ways the underlying circuit-switching technology worked (going back to human operators working a switchboard!). Packet-switched networks were revolutionary because, unlike the phone system, they were agnostic to the underlying transport medium. TCP/IP was designed for point-to-point communication, based on the assumption that the primary use of data networks would still be point-to-point "conversations". Also, TCP/IP was designed in an environment where each computer had many users, by contrast with today, where you have many computers per user. The second part of the talk describes where we are today and how networks can be adapted to make things better.
Modern Internet usage has evolved such that the vast majority of traffic is better described as broadcast rather than point-to-point: publishing web pages, streaming video, file sharing, even email in the case of mailing lists. This is very inefficient when many users request the same data at once. Another problem posed by current architectures is the challenge of data synchronization between devices, which can also be traced to the fact that devices are often required to synchronize on a peer-to-peer basis rather than having a mechanism to broadcast changes to other devices.

The proposed solution is a bit light on details but big on ideas: to deal with problems of scale in the age of Internet publishing, we step away from our notions of purely fixed-address, point-to-point communication, and consider that in many cases it is highly desirable to be able to automatically replicate and propagate data. In the example given, when you access the New York Times front page, you shouldn't care whether the actual data you get is served from the NYT web server or from some other downstream server that has a copy -- provided you can verify that it originated from the NYT by checking the digital signature.

One significant idea mentioned was that, in the way that TCP/IP abstracts the underlying physical transport layer, such a system ought to be abstracted from the protocol layer -- so that data can be propagated by whatever physical or virtual means are most appropriate or available. He points to Gnutella and BitTorrent as examples of trends in this direction. Each system demonstrates the two key properties of this type of approach: once something is published and replicated a few times, it may stay in the network even if the original source is no longer available; and popular resources are inherently load-balanced, since the more people access a resource, the more intermediate servers will have a copy.
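The NYT example boils down to: trust the content, not the host that served it. Here is a minimal sketch of that idea in Python, using a plain SHA-256 digest as a stand-in for a real publisher signature (a production system would use an asymmetric signature over the content); all names here are illustrative:

```python
import hashlib


def publish(content: bytes) -> tuple[bytes, str]:
    """Publisher computes a digest that travels with the content.
    (Stand-in for a real digital signature from the publisher.)"""
    return content, hashlib.sha256(content).hexdigest()


def fetch_from_any_mirror(mirrors: dict[str, bytes], trusted_digest: str) -> bytes:
    """Accept the content from whichever host has it, as long as
    it matches the publisher's digest -- origin doesn't matter."""
    for host, data in mirrors.items():
        if hashlib.sha256(data).hexdigest() == trusted_digest:
            return data
    raise ValueError("no mirror served an authentic copy")


# The front page as published by the origin server:
front_page, digest = publish(b"All the News That's Fit to Print")

# Downstream copies: one authentic mirror, one tampered one.
mirrors = {
    "mirror-b.example": b"Totally different content",
    "mirror-a.example": b"All the News That's Fit to Print",
}

assert fetch_from_any_mirror(mirrors, digest) == front_page
```

The tampered mirror is simply skipped; any downstream host holding an authentic copy is as good as the origin.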
Unfortunately he didn't seem to mention Freenet (http://freenetproject.org), which to my knowledge is the most complete implementation of many of the ideas he's promoting.

Commentary:

This talk is primarily aimed at spurring people to do more research in this area. For this reason, it poses many questions but provides few concrete answers as to how such a system would be put together in practice. He helpfully separates it out into the "easy stuff" (problems for which reasonable solutions already exist) and the "hard stuff" (everything else).

He doesn't really touch on the highly dynamic nature of current web sites. When every user is served a custom web site, complete with widgets and ads personalized to their zip code, it's much more difficult to replicate in a useful way. Of course, media (sound, images, video, maybe 3D meshes later on) are usually not (yet) dynamically generated and account for quite a lot of bandwidth, so there are still gains to be made there. Resources like HTML pages could also be divided up into finer-grained representations that distinguish static and dynamic elements. He does mention that timestamps and versioning would need to be an inherent part of this system so that published resources can be updated.

It's worth noting that a key difference from caching seems to be that this would be inherently a "push" system -- when you publish something, you go and bang on the doors of nearby hosts and ask them to pretty please replicate your data and pass it on if they know of any other hosts that might be interested. This is interesting, because this kind of "push/flood" system ends up being similar to store-and-forward message routing, as new data is directed through several hops to eventually reach every host that has expressed an interest in it.
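That push/flood behavior can be sketched as a breadth-first flood with de-duplication by message id. This is my own toy illustration (hypothetical names throughout; a real system would also filter by each host's expressed interests rather than storing everywhere):

```python
from collections import deque


class Host:
    """A node in the overlay: knows its peers, stores replicas by msg id."""
    def __init__(self, name: str):
        self.name = name
        self.peers: list["Host"] = []
        self.store: dict[str, bytes] = {}  # doubles as the "already seen" set


def flood(origin: Host, msg_id: str, data: bytes) -> None:
    """Push-style propagation: the publisher bangs on its neighbours'
    doors; each host stores the data once and forwards it onward."""
    origin.store[msg_id] = data
    queue = deque([origin])
    while queue:
        host = queue.popleft()
        for peer in host.peers:
            if msg_id not in peer.store:  # de-duplicate: each hop stores once
                peer.store[msg_id] = data
                queue.append(peer)


# A small topology: a -- b -- c, with d also hanging off b.
a, b, c, d = Host("a"), Host("b"), Host("c"), Host("d")
a.peers, b.peers, c.peers, d.peers = [b], [a, c, d], [b], [b]

flood(a, "post-1", b"hello world")

# Every reachable host now holds a replica, so "post-1" survives even
# if host a goes offline, and popular data is inherently load-balanced.
assert all("post-1" in h.store for h in (a, b, c, d))
```

Note how this is exactly store-and-forward: each intermediate hop keeps a copy and relays it, so the data outlives any single source.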
How this might influence VOS:

Replication, migration and versioning are essential for the long-term scalability of a distributed system like VOS, and VOS is in many ways a great example of the kind of "data dissemination" system he talks about. Something I've also come to realize is that some notion of time in the system is critical, and that "time" and "versioning" are fairly closely related concepts when describing a series of changes to a particular resource. So it is useful to think about how the s5 design will accommodate object replication and migration and their relationship to time and versioning.

Something else we need to consider is the fact that vobjects are both declarative (well-defined data fields, not opaque) and computational objects. Replicating data is relatively straightforward, but what about computation? I can think of at least three cases when making a call on a replicated object:

- No replicated computation: no chance for local processing; always send a message to the master vobject. Example: talk messages.
- Predictive computation: send a message and try to guess the result, but there's a chance we'll be overruled. Example: movement interpolation, physics.
- Deterministic computation: the behavior will have an effectively identical outcome whether run in the local replica or the master vobject. Example: a mouse rollover graphic effect.

To really support replication in the presence of versioning, the vobject "descriptor" needs to incorporate time and versioning to get a fully qualified vobject identifier.
Such an identifier, suitable for replication and caching (including routing and security bits), might include:

* site id
* vobject id
* embedded child id
* last modification time
* version number
* capability key
* hash code

So, Lalo, this is probably a bit more than you expected :-) I think the answer to your question ("could VOS be useful for the things Van Jacobson talks about") is yes, if we incorporate a robust notion of time and version as related to state changes. If anyone thinks this is fanciful, note that this cuts right to the core of how remote vobjects work and how we eventually handle caching -- central issues in the s5 redesign.

_______________________________________________
vos-d mailing list
[email protected]
http://www.interreality.org/cgi-bin/mailman/listinfo/vos-d
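To make the descriptor and the three computation cases concrete, here is one possible sketch in Python. The field names, the `ComputationMode` enum, and the `supersedes` rule are my own illustration of how the pieces might fit together, not a VOS design:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ComputationMode(Enum):
    """The three cases for a call made on a replicated vobject."""
    NONE = auto()           # always forward to the master (e.g. talk messages)
    PREDICTIVE = auto()     # guess locally, master may overrule (e.g. physics)
    DETERMINISTIC = auto()  # safe to run on any replica (e.g. rollover effect)


@dataclass(frozen=True)
class VobjectDescriptor:
    """A fully qualified, version-aware vobject identifier, suitable
    for replication and caching (routing and security bits included)."""
    site_id: str
    vobject_id: str
    embedded_child_id: Optional[str]
    last_modified: float      # timestamp of the last state change
    version: int
    capability_key: bytes     # security bits for access control
    content_hash: str         # hash code over the replicated state

    def supersedes(self, other: "VobjectDescriptor") -> bool:
        """True if this descriptor names a newer version of the same
        vobject -- i.e. a replica holding `other` is now stale."""
        return ((self.site_id, self.vobject_id, self.embedded_child_id)
                == (other.site_id, other.vobject_id, other.embedded_child_id)
                and self.version > other.version)
```

With something like this, a cache can compare descriptors to decide whether its replica is current, and the `ComputationMode` tag tells it whether a call can be serviced locally or must travel to the master.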
