I thought it might be useful if I outlined... Basic principles for long-term development (targeted as release 1.0):
- a metabase represents a "zone of access", i.e. a set of resources that *can* be accessed together (presumably because they are stored "in the same place" as the metabase), and that a user *wants* to access together (e.g. one metabase might represent a "sandbox" in which a developer prototypes and initially tests a set of resources, but doesn't want them visible to any other processes or users).
- a user would typically add resources to a test metabase (i.e. a metabase not in his usual PYGRDATAPATH), and later "publish" them to his personal metabase, then to a "group" metabase (for his co-workers), and finally to a public metabase (accessible to the internet in general). Thus, copying resource info from one metabase to another becomes a system for publishing data.
- we would aim to make this copying process totally automatic and transparent, both for remote access (i.e. a server that accepts queries from remote clients) and for fully transferring data to a user's local filesystem (in the spirit of the current download=True mechanism).
- there would be a DNS-like system for finding "the nearest available instance" of resources in the public namespace.
- just as an organization or group controls its subdomain in the DNS address space (i.e. they control what names get added to that subdomain, and what each name maps to), they could "own" a piece of the pygr.Data namespace (e.g. the Santa Cruz Genome center would control the subdomain Bio.MSA.UCSC), and would publish resources into that subdomain. Initially those resources would physically live only on the site where they were originally published, but as more people requested those resources to be pulled to their own servers for high speed access, popular resources would automatically get distributed to many sites around the world, which would then serve both requests to use them (by remote clients) and requests to download them to users' local filesystems.
- Obviously all this needs to be secure, using the well-established framework of public key signatures and GPG-style networks of trust. That should be implemented at the basic level, i.e. pickles should be signed and verifiable. With that infrastructure in place, *code* can also be published this way, i.e. there would be a "pygr.Code" namespace representing both APIs and implementations. A given dataset (content) would specify an interface required for opening it; that interface would have a default implementation (or a user could specify they want a different implementation as an "alias" in their metabase). If the user already has that module, it gets used in the usual pickle way. If he doesn't, the code gets pulled from "the nearest available instance" in the usual metabase-DNS way, its signature verified, and checked against the user's network of trust.
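
As a rough illustration of the publishing and lookup ideas above, here is a minimal Python sketch. Nothing in it is existing pygr.Data API: the Metabase class, owner_of(), nearest_instance(), the site names, the cost numbers and the resource name "Bio.MSA.UCSC.example_alignment" are all made up for the example. It only shows (1) publishing as copying resource info from one metabase to the next, (2) DNS-style ownership decided by the longest registered prefix of a resource name, and (3) picking the cheapest site that actually holds a resource.

```python
# Hypothetical sketch only: none of these names exist in pygr.Data.


class Metabase:
    """A named zone of access mapping resource IDs to resource info."""

    def __init__(self, name):
        self.name = name
        self.resources = {}  # resource ID -> dict of resource info

    def add(self, resource_id, info):
        self.resources[resource_id] = dict(info)

    def publish(self, resource_id, target):
        """Publishing is just copying the resource info to another metabase."""
        target.add(resource_id, self.resources[resource_id])


def owner_of(resource_id, owners):
    """Return the owner of the longest registered prefix, or None."""
    parts = resource_id.split(".")
    for end in range(len(parts), 0, -1):
        prefix = ".".join(parts[:end])
        if prefix in owners:
            return owners[prefix]
    return None


def nearest_instance(resource_id, sites):
    """Pick the lowest-cost site holding the resource.

    sites maps site name -> (cost, set of resource IDs held there).
    """
    holding = [(cost, site) for site, (cost, held) in sites.items()
               if resource_id in held]
    return min(holding)[1] if holding else None


RESOURCE = "Bio.MSA.UCSC.example_alignment"  # made-up resource name

# Publish along the chain: sandbox -> personal -> group -> public.
sandbox, personal, group, public = (
    Metabase(n) for n in ("sandbox", "personal", "group", "public"))
sandbox.add(RESOURCE, {"description": "prototype alignment"})
sandbox.publish(RESOURCE, personal)
personal.publish(RESOURCE, group)
group.publish(RESOURCE, public)

owners = {"Bio.MSA.UCSC": "UCSC"}  # assumed subdomain registry
sites = {  # assumed sites with arbitrary access costs
    "origin.example": (120, {RESOURCE}),
    "mirror.example": (15, {RESOURCE}),
    "other.example": (5, set()),  # cheapest, but does not hold the resource
}
print(owner_of(RESOURCE, owners))
print(nearest_instance(RESOURCE, sites))
```

A real system would presumably measure "nearest" by network distance or load rather than a fixed number, and would update the list of holding sites as popular resources get replicated.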
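
The "pickles should be signed and verifiable" point can be sketched in the same spirit. Important assumption: to stay within the standard library, this example uses HMAC-SHA256 with a shared secret as a stand-in for a real public-key signature (GPG or similar), and a flat list of trusted keys as a stand-in for a network of trust. The function names sign_pickle() and load_verified() are hypothetical. The one thing it is meant to show is the ordering: the signature is checked *before* the payload is unpickled, since unpickling untrusted data can execute arbitrary code.

```python
import hashlib
import hmac
import pickle

SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_pickle(obj, key):
    """Pickle obj and prepend an HMAC tag (stand-in for a public-key signature)."""
    payload = pickle.dumps(obj)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return signature + payload


def load_verified(blob, trusted_keys):
    """Unpickle only if the tag verifies under one of the trusted keys."""
    signature, payload = blob[:SIG_LEN], blob[SIG_LEN:]
    for key in trusted_keys:
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        if hmac.compare_digest(signature, expected):
            # Verified, so it is now acceptable to unpickle.
            return pickle.loads(payload)
    raise ValueError("signature not verified by any trusted key")


# Made-up key and resource info, for illustration only.
publisher_key = b"example-publisher-key"
blob = sign_pickle({"id": "Bio.MSA.UCSC.example_alignment"}, publisher_key)
resource_info = load_verified(blob, [publisher_key])
```

With real public-key signatures the publisher would sign with a private key and clients would verify with the public key, so no secret would need to be shared; the check-then-unpickle structure would stay the same.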
