Absolutely agree. For a fixed file, a hash of the file content can serve
as a globally unique identifier for that content. I believe that some
"content integrators", which fetch the same content from multiple sources
(e.g., HTTP/FTP, BT, eMule, their own network), search and index content
across networks this way. Of course, this is not discovery of content in
general, but discovery of one particular encoding of the content (e.g.,
the same picture stored once as JPG and once as PNG will not be
considered the same "content"). Do people know of other robust, scalable
content discovery techniques?
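To make the hash-as-identifier idea concrete, here is a minimal sketch. It assumes SHA-256 as the hash (the discussion does not name one), and the function name content_id and the placeholder byte strings are hypothetical, chosen only for illustration.

```python
import hashlib
import io


def content_id(stream, chunk_size=64 * 1024):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Identical bytes fetched from two different sources map to the same ID.
copy_from_http = io.BytesIO(b"example file bytes")
copy_from_bt = io.BytesIO(b"example file bytes")
same_id = content_id(copy_from_http) == content_id(copy_from_bt)

# Two encodings of the "same" picture are different byte strings,
# so they get unrelated IDs (placeholder bytes, not real images).
as_jpg = io.BytesIO(b"\xff\xd8\xff placeholder jpg bytes")
as_png = io.BytesIO(b"\x89PNG placeholder png bytes")
different_id = content_id(as_jpg) != content_id(as_png)

print(same_id, different_id)
```

An ID of this kind is fixed for as long as the bytes are fixed, which is also why it identifies an encoding rather than the underlying content.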
Richard

On Fri, 1 May 2009, 4:05pm -0400, Eric Burger wrote:

> I would expect that to do content discovery or peer discovery one would
> need a relatively fixed identifier for that content.
>
> On May 1, 2009, at 3:57 PM, Y. R. Yang wrote:
>
> >Hi Eric,
> >
> >On Fri, 1 May 2009, 3:53pm -0400, Eric Burger wrote:
> >
> > >I think such a proposal would require a globally unique and time
> > >invariant content ID. That sounds challenging at best.
> >
> >I am not sure I am following. What requires a globally unique and time
> >invariant content ID?
> >
> >Richard
> >
> > >On May 1, 2009, at 12:36 PM, Y. R. Yang wrote:
> > >
> > > >Hi All,
> > > >
> > > >I found the discussion related to content ID interesting.
> > > >
> > > >Pushing upper layer information, in particular, content ID, into
> > > >ALTO, according to my perspective, is to make it possible to turn an
> > > >ALTO server into a session tracker so that it may take on additional
> > > >potential functions (depending on the format of the content ID)
> > > >including content discovery (e.g., discover popular contents, notify
> > > >caches), peer discovery (e.g., allow caches to register
> > > >availability, return peers that a requesting peer does not know but
> > > >ALTO server knows), and load balancing (e.g., the discussion on
> > > >re-balancing). This email thread is discussing using it for rights
> > > >management. I have no problem in adding an optional Content ID.
> > > >
> > > >But we need to keep in mind that providing the aforementioned
> > > >functions may lead to a substantially more complex ALTO server
> > > >architecture and semantics. I am particularly not clear about the
> > > >statement that it is better for ALTO to provide information in the
> > > >context of a particular content/swarm/channel (identified by a
> > > >content ID in an ALTO query). What is the semantics/meaning that the
> > > >ALTO info returned is adapted to a particular content/swarm/channel?
> > > >There can be many types of contents, e.g., file (BT block scheduling
> > > >vs E2dk which uses a priority queue), live streaming, VoD, VoIP,
> > > >game. Different applications/variants will have their specific
> > > >requirements/secret sauce for constructing peer communication
> > > >patterns. A query may be issued in a particular context, e.g., a
> > > >seeder is looking for leechers or a leecher is looking for seeders.
> > > >An application may use a lot more information (e.g., who are
> > > >sources, who are seeders, upload/download capacity, buffer status,
> > > >playout delay) to construct peer communication patterns. ALTO
> > > >network information is just one of the many inputs. Are we talking
> > > >about designing an omnipotent ALTO server that functions as a
> > > >universal application tracker?
> > > >
> > > >To make progress, follow the end-to-end design principle, and
> > > >implement modular/reusable design, I feel that we should first
> > > >design the most basic, reusable ALTO component, whose function is
> > > >just to provide simple, useful network information service, which is
> > > >likely to be content independent. Then we can talk about more
> > > >extensions. Content protection (e.g., content ID as access control
> > > >token), content discovery (e.g., mapping from a content ID to a list
> > > >of servers), content notification to caches, cache integration,
> > > >session tracking, peer selection should be independent services,
> > > >should ALTO provide some of them. The protocols (e.g., ALTO/P4P
> > > >InfoExport interface descriptor) I have seen so far are quite
> > > >extensible to accommodate new services.
> > > >
> > > >Richard
> > > >
> > > >On Thu, 30 Apr 2009, 3:58pm -0700, Richard Bennett wrote:
> > > >
> > > > >As long as ALTO is defined as a service that maps a content ID to
> > > > >a collection of paths, it seems to me that it's piracy-neutral.
> > > > >I'm disturbed by a system that is passed a collection of paths and
> > > > >asked to rank them, as that would seem to have a definite
> > > > >pro-piracy bias. But I agree with Nick (I think) that a system
> > > > >that accepts a transient content ID and returns a list of paths is
> > > > >neither pro-piracy nor anti-piracy. The rights management function
> > > > >can be layered on top of ALTO, as I think it should be, as a kind
> > > > >of DNS-for-content that takes some sort of textual description of
> > > > >the content and returns an identifier that ALTO can then use to
> > > > >guide toward the best paths. The higher level function - the
> > > > >content mapper - can be developed independent of IETF guidance and
> > > > >in accordance with some sort of deal between content producers and
> > > > >network operators. The content mapper is where the rights
> > > > >management goes, not in ALTO.
> > > > >
> > > > >RB
> > > > >
> > > > >Nicholas Weaver wrote:
> > > > >
> > > > > >On Apr 30, 2009, at 2:19 PM, DePriest, Greg (NBC Universal)
> > > > > >wrote:
> > > > > >
> > > > > > >Thanks to Enrico and Nicholas for providing additional
> > > > > > >background and explanations.
> > > > > > >
> > > > > > >The key point of disagreement seems to be that adding a
> > > > > > >content protection requirement to ALTO would "hugely
> > > > > > >complicate and compromise the design of ALTO."
> > > > > > >
> > > > > > >I'm not an expert in such matters, have very limited exposure
> > > > > > >to the area, and can't help but wonder if that is, in fact,
> > > > > > >correct.
> > > > > > >
> > > > > > >Was there a serious investigation or did someone simply do a
> > > > > > >back-of-the-envelope analysis?
> > > > > >
> > > > > >For me, it's "Intuition backed up by a threat analysis and usage
> > > > > >cases":
> > > > > >
> > > > > >We have legitimate uses which require ID churn: it's the only
> > > > > >way to guarantee that a rebalancing is fresh. Especially since
> > > > > >nodes churn all the time, and ALTO may not have notification
> > > > > >when nodes leave.
> > > > > >
> > > > > >We have legitimate uses which require IDs to be arbitrary
> > > > > >(rather than representative hashes): ALTO is not just for file
> > > > > >distribution, but other P2P optimization (e.g., optimizing for
> > > > > >low latency for DHTs) where hashes don't have meaning. ALTO
> > > > > >doesn't want to deal with particular P2P protocols, which all
> > > > > >may have different representations of what data or blocks are.
> > > > > >And doesn't want to deal with colliding namespaces from
> > > > > >different P2P programs. Thus defining ID as a UUID or other
> > > > > >opaque identifier means ALTO doesn't have to deal with these
> > > > > >problems.
> > > > > >
> > > > > >We have legitimate uses which require IDs to be creatable
> > > > > >at-will by any party: Otherwise, ALTO becomes an admission-only
> > > > > >system which limits utility.
> > > > > >
> > > > > >Yet with all three decisions (allowing churn, opaque-data IDs,
> > > > > >at-will ID creation), there is an easy countermeasure to ANY
> > > > > >system predicated on "block bad IDs", as long as that system has
> > > > > >a slower response time than the P2P network you are trying to
> > > > > >prevent from optimizing its communication, and you can't do
> > > > > >"only allow good IDs" if IDs are creatable at-will by any party.
> > > > > >
> > > > > >And "if a defense has a trivial countermeasure, don't bother
> > > > > >deploying it".
> > > > > >
> > > > > >Thus this means the only way to make ALTO "content protecting"
> > > > > >is to remove one of those three constraints. But all three
> > > > > >features are very valuable in a localization service.
> > > > > >
> > > > > >Additionally, there is a large bias in the network community in
> > > > > >general to be "content neutral". Any time you cease to be
> > > > > >content neutral on the technical level, it must necessarily
> > > > > >impose constraints and costs on the system.

_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
