Re: repository@ awareness?
Noel:

Thanks for the W3C style reference. One of the subjects it deals with is content negotiation (http://www.w3.org/Provider/Style/URI.html#remove), and this got me thinking about how metadata, as opposed to the resource that the metadata describes, can be resolved. I'm going to try to dig up some more on the content negotiation subject, as this may be a factor in resolving some of the requirements I have.

Stephen.

Noel J. Bergman wrote:

> Stephen McConnell asked:
>
>> File system - a convenient and simple solution - but should a file system driven approach be the basis for the next generation?
>
> The basis is a URI space. Whether a URI is efficiently served by a static file, or by some servlet, CGI or Grandma Moses typing very VERY fast really should not be visible to the user-agent.
>
>> A solution must be implementation independent
>
> See: http://www.w3.org/Provider/Style/URI.html
>
> The URI is a request for content. It should not change, regardless of the means by which the content is generated.
>
>> So why a preoccupation with meta-less file system structures as opposed to a preoccupation with an extensible repository protocol?
>
> The extensible repository protocol is HTTP. Nothing else needs to be visible. The only thing that the infrastructure team needs to deal with is the implementation of the URI space (allowing that the content addressed by a URI can vary based upon the user-agent).
>
> --- Noel

--
Stephen J. McConnell
mailto:[EMAIL PROTECTED]
RE: repository@ awareness?
> You're saying that those interested in enabling a repo with metadata and searches based on this metadata could wrap the repository with a servlet.

Could? Yes. But that is just one way of many. I maintain that httpd could serve the content of most repositories, meta-data and all, without dynamic content generation.

> The URI could be used by the servlet to give a different view of the repository based on [criteria embedded in the request]

IMO, the URI should encode the complete request. There should not be any other implied context.

> the servlet manages the interaction behind the scenes with some sort of metadata database to conduct the query and return the results as if they were regular files on the server's repo file system.

It depends upon the repository implementation. It could work as you describe, or there could just be pre-built metadata stored in files.

Consider that eventually web sites will likely use Subversion with WebDAV as their authoring mechanism. Authorized people will post directly to a Subversion repository. Although httpd can load directly from Subversion, that will not be as efficient as serving directly from the file system. The reason is that sendfile() does not work directly out of a BDB database (as far as I know). Therefore, when a file is posted to Subversion, it could be mirrored by a hook to a directory representing the current content, which is what would then be served by httpd.

We used a similar technique at GEIS years ago with SourceSafe, so that when a check-in occurred, a copy went into a shadow directory, and a build test was initiated. Likewise, a tool could be invoked to build meta-data and store it in the file system.

So there are ways and ways and more ways. The goal is the same, as should be the externally viewable behavior.

--- Noel
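A minimal sketch of the mirror-plus-metadata idea described above, in Python. It assumes the commit hook has already exported the posted file to a local path; how Subversion invokes the hook is deliberately left out. The function name `mirror_with_metadata`, the `.meta` sidecar suffix and the name=value format are hypothetical choices for illustration, not anything proposed in the thread.

```python
import hashlib
import shutil
from pathlib import Path


def mirror_with_metadata(source: Path, shadow_root: Path, relative: str) -> Path:
    """Copy `source` into the shadow tree and write a static metadata file."""
    target = shadow_root / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)

    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    # Hypothetical sidecar convention: "<artifact>.meta" with name=value lines.
    # Both files are plain static content that httpd could serve directly.
    meta = target.parent / (target.name + ".meta")
    meta.write_text(
        f"name={target.name}\n"
        f"size={target.stat().st_size}\n"
        f"sha256={digest}\n"
    )
    return meta
```

With this arrangement the metadata is just another static file next to the artifact, so nothing dynamic is needed at request time.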
Re: repository@ awareness?
Stephen McConnell wrote:

> Noel:
>
> Thanks for the W3C style reference. One of the subjects it deals with is content negotiation (http://www.w3.org/Provider/Style/URI.html#remove), and this got me thinking about how metadata, as opposed to the resource that the metadata describes, can be resolved. I'm going to try to dig up some more on the content negotiation subject, as this may be a factor in resolving some of the requirements I have.

The XML-DEV and various RDF related mailing lists hold discussions about this topic regularly.

BTW this raises the question of whether an RDF derivative or a completely self-designed XML vocabulary will be used for the repository metadata.

J.Pietschmann
Re: repository@ awareness?
Leo Simons wrote:

> Justin Erenkrantz wrote:
>
>> Do any 'core' infrastructure people need to get involved to help guide with what's practical or not?
>
> yep. But I doubt you really need to get 'deeply' involved. A half-page explanation of what resources are and are not available should be enough, don't you think?

I'm probably in a minority - so don't count anything I say as an indicator of public opinion.

First off - the board wants human readable safe downloading. Personally I think this objective is of minor relevance/impact to ASF in the medium term. Since early 1998, the notion of repository-aware applications has been growing. Here in Apache it's in its infancy - but clearly prevalent in the Java community. Maven is an early example (hit a repository for jar downloading to resolve n build dependencies) - Avalon is another example (hit the repository and get back a class loader hierarchy).

File system - a convenient and simple solution - but should a file system driven approach be the basis for the next generation? My conclusion - no. A solution must be implementation independent - I should be able to map a protocol to an RDBMS, LDAP, simplistic HTTP over file layout, even an XMI repo over IIOP if deemed appropriate. So why a preoccupation with meta-less file system structures as opposed to a preoccupation with an extensible repository protocol?

Here is an example of a modern repository aware application.

  $ merlin http://dpml.net/avalon-http/block.xml

The above command has executed the following:

  (a) bootstrapping of a repository client
  (b) resolution of repository adapter implementation
  (c) downloading and installation of repository adapter and dependencies (meta data)
  (d) bootstrapping of the repository adapter into action (meta data)
  (e) downloading of block.xml using the repository adapter (i.e. protocol independent)
  (f) validation of the downloaded artifact (meta data)
  (g) construction of information about block dependencies by the local app (meta data)
  (h) recursive downloading of artefact dependencies (meta data)
  (i) local creation of a class loader hierarchy based on class loader assignments (meta data)
  (j) creation of a container holding a set of composite components
  (k) orderly deployment of supporting components
  (l) startup of a web server, a set of business components, and a servlet

A first-time user will trigger something in the order of 30-40 downloads. The local system will cache information and monitor the repository for changes.

Step 2 - user launches a command to manage the running servlet

  (a) jmx management libraries are auto-downloaded (meta data)
  (b) along with a dozen commons jar files (meta data)
  (c) management app invokes request on management agent download
  (d) agent is deployed in a target JVM (local deployment)
  (e) jnlp client completes downloading of three jar files signed using X509 certificates into a third JVM
  (f) applet appears in user's browser
  (g) user updates parameters
  (h) updated deployment profile is sent to remote repository (meta data)
  (i) local client synchronizes local cache relative to remote repo (meta data)

All of the above from one command and a few clicks of a mouse.

Ok, I confess - we don't have all of the above in place today - but do have the majority. This benefits significantly from a rigorous protocol supporting artefact location, feature assessment (meta data), authentication, replication and validation.

An argument that appears popular on repository@ is that the basic file system does not need to be meta-aware - i.e. no distinction between artefact and info-about-an-artefact. IMO it is basically a misadventure to focus so closely on subjects such as file system structure (the lowest common denominator solution). Instead, should we not be defining a protocol that is transport and implementation independent? A protocol that will enable the functional requirements of artefact authentication, artefact navigation, artefact retrieval and artefact registration.

Popular arguments are that agreement on meta information associated with artefacts is not achievable - and yet the simple notion of name-value pairs is a widespread abstraction. This simple notion of the artefact + information about an artefact is IMO a fundamental requirement. After all - isn't this 2003 - we have the technology! Surely our repository spec should enable an implementation based on a file system, but equally, should not restrict the potential for transparent replacement with alternative, more advanced and efficient solutions.

Also of relevance are the economic and social impacts. A repository not capable of supporting or evolving towards forward looking repository-enabled requirements as outlined in the above scenario is destined to be redundant within a matter of a few years. Redundant because it will not be relevant to the predominant programmatic scenarios and redundant because it will not meet basic functional requirements.
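A minimal Python sketch of what a transport- and implementation-independent repository protocol with name-value metadata might look like. Everything here is hypothetical: the `Repository` interface, its method names and the `InMemoryRepository` stand-in are illustrations of the argument in this message, not Merlin's or Avalon's actual API. Authentication is omitted to keep the sketch short.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Repository(ABC):
    """Hypothetical protocol: clients see these operations, never the storage."""

    @abstractmethod
    def register(self, artefact: str, content: bytes, **metadata: str) -> None:
        """Store an artefact together with name-value metadata."""

    @abstractmethod
    def retrieve(self, artefact: str) -> bytes:
        """Return the artefact content."""

    @abstractmethod
    def describe(self, artefact: str) -> Dict[str, str]:
        """Return the name-value pairs recorded for the artefact."""

    @abstractmethod
    def navigate(self, prefix: str) -> List[str]:
        """List artefact names that start with the given prefix."""


class InMemoryRepository(Repository):
    """Stand-in backend; a file system, RDBMS or LDAP backend would
    implement the same four methods."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self._metadata: Dict[str, Dict[str, str]] = {}

    def register(self, artefact: str, content: bytes, **metadata: str) -> None:
        self._content[artefact] = content
        self._metadata[artefact] = dict(metadata)

    def retrieve(self, artefact: str) -> bytes:
        return self._content[artefact]

    def describe(self, artefact: str) -> Dict[str, str]:
        return dict(self._metadata[artefact])

    def navigate(self, prefix: str) -> List[str]:
        return sorted(name for name in self._content if name.startswith(prefix))
```

A client written against `Repository` would not need to change if the in-memory backend were swapped for one backed by a file layout or a database, which is the replaceability being argued for.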
Re: repository@ awareness?
Stephen McConnell wrote:

> [..]
>
> All of the above from one command and a few clicks of a mouse.
>
> Ok, I confess - we don't have all of the above in place today - but do have the majority. This benefits significantly from a rigorous protocol supporting artefact location, feature assessment (meta data), authentication, replication and validation.
>
> An argument that appears popular on repository@ is that the basic file system does not need to be meta-aware - i.e. no distinction between artefact and info-about-an-artefact.

Stephen:

Please understand that an artifact's meta data is simply another artifact. Every file which lives in the repository is an artifact, and we don't need any extra level of abstraction.

The notion of artifact + information about an artifact is fully covered once we clarify the notion of an artifact and define the repository layout for artifacts. You can have as many levels of metadata as you would like (meta data, metametametameta data and whatever else anybody will need).

In the maven world we have

  foo/jars/foo-1.0.jar
     /poms/foo-1.0.pom

The jar is an artifact. The pom is also an artifact, which provides some meta information (of course not all) about the jar. You can add as many other files to the repository as you wish.

There is a clear distinction between artifact and info-about-an-artifact, as they will both be different files in the repository (artifacts). Possibly info-about-an-artifact could be located in a few files and accessed selectively by different tools.

Metadata about the repository itself can also be kept in the repository. Even directory listings in a few different flavours for different tools can be in the repository.

Can you explain what exactly is not covered by such an approach?

> [..]
>
> Also of relevance are the economic and social impacts. A repository not capable of supporting or evolving towards forward looking repository-enabled requirements as outlined in the above scenario is destined to be redundant within a matter of a few years. Redundant because it will not be relevant to the predominant programmatic scenarios and redundant because it will not meet basic functional requirements.

So you want us to predict what will happen in a few years :)?

Again I don't understand you: you can build any abstractions you like on top of the repository with the features that were discussed. Aren't you doing it even now when you use the maven repository for storing information about your avalon services?

Michal
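The layout convention in this message can be sketched as a pair of small Python functions. The only facts taken from the thread are the two example paths (`foo/jars/foo-1.0.jar` and `foo/poms/foo-1.0.pom`); the function names and the generalisation "type directory = file extension plus s" are assumptions made for illustration.

```python
def artifact_path(project: str, version: str, kind: str) -> str:
    """Build the repository-relative path for one artifact of a project."""
    # Assumption: the type directory is the plural of the file extension,
    # as in the two examples from the message ("jars", "poms").
    return f"{project}/{kind}s/{project}-{version}.{kind}"


def pom_for(jar_path: str) -> str:
    """Locate the POM that describes a jar, using only the path convention."""
    project, _, filename = jar_path.split("/")
    stem = filename[: -len(".jar")]      # e.g. "foo-1.0"
    version = stem[len(project) + 1:]    # drop the "foo-" prefix
    return artifact_path(project, version, "pom")


print(artifact_path("foo", "1.0", "jar"))  # foo/jars/foo-1.0.jar
print(pom_for("foo/jars/foo-1.0.jar"))     # foo/poms/foo-1.0.pom
```

The point of the sketch is that a tool can get from an artifact to its metadata with nothing more than a naming rule, which is the position Michal is defending.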
RE: repository@ awareness?
Justin,

> Is anyone on infrastructure@ aware of what's going on in [EMAIL PROTECTED]?

Not from what I see on the subscriber list, which is why I have suggested on more than one occasion that such participation is important.

> Apparently, AFAICT, that list is supposed to allow for Java-based distribution of software. Other than that, I'm completely lost as to what that list is for.

Eventually, it would be desirable to have a user-friendly tool that is capable of picking up, for example, httpd source, tomcat, and other parts, and doing a platform-specific install. But the tool is someone else's problem. The only thing that the repository needs to do is provide a non-fragile URI space for artifacts, of which files and, eventually, metadata are both examples.

> Do any 'core' infrastructure people need to get involved to help guide with what's practical or not?

Yes.

> With a quick perusal of [EMAIL PROTECTED], I got the sense that they might be out in la-la land

Agreed. The discussion on [EMAIL PROTECTED] was getting into tool areas that should be relatively orthogonal to the repository. There are three areas:

  - URI space
  - metadata
  - tools

The first is the main issue that the repository needs to address. The second is an area where, after we have decided upon the URI space, the tool groups could use the repository list as a gathering place to seek common ground. And then there are tools, which belong elsewhere, but use the repository. Some people are jumping ahead to tools before the URI space is resolved.

> (The people advocating a file layout *only* get my uninformed +1.)

I think that most people recognize that the file-layout-only approach to the URI space is necessary. Meta-data is present in the URI space, and can be implemented with a static file. Even if we want to key off the user-agent for meta-data, that can still be served with static content in the file space.

--- Noel
RE: repository@ awareness?
Stephen McConnell asked:

> File system - a convenient and simple solution - but should a file system driven approach be the basis for the next generation?

The basis is a URI space. Whether a URI is efficiently served by a static file, or by some servlet, CGI or Grandma Moses typing very VERY fast really should not be visible to the user-agent.

> A solution must be implementation independent

See: http://www.w3.org/Provider/Style/URI.html

The URI is a request for content. It should not change, regardless of the means by which the content is generated.

> So why a preoccupation with meta-less file system structures as opposed to a preoccupation with an extensible repository protocol?

The extensible repository protocol is HTTP. Nothing else needs to be visible. The only thing that the infrastructure team needs to deal with is the implementation of the URI space (allowing that the content addressed by a URI can vary based upon the user-agent).

--- Noel
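As a rough illustration of the last point, here is a Python sketch of how one stable URI could map to different static files depending on the user-agent. The `AGENT_VARIANTS` table, the variant suffixes and the `resolve` function are hypothetical; a real httpd deployment would more likely use its own content negotiation or rewrite configuration, and this sketch does no path sanitising.

```python
from pathlib import Path
from typing import Dict

# Hypothetical mapping from a User-Agent prefix to a variant file suffix.
AGENT_VARIANTS: Dict[str, str] = {"maven": ".maven", "merlin": ".merlin"}


def resolve(doc_root: Path, uri_path: str, user_agent: str) -> Path:
    """Pick the static file to serve for a URI, keyed off the user-agent."""
    base = doc_root / uri_path.lstrip("/")
    agent = user_agent.lower()
    for prefix, suffix in AGENT_VARIANTS.items():
        variant = base.parent / (base.name + suffix)
        # Serve the tool-specific variant only if one was pre-built.
        if agent.startswith(prefix) and variant.is_file():
            return variant
    # Everyone else, including browsers, gets the default file.
    return base
```

The URI seen by the client never changes; only the choice of pre-built static file behind it does, so the implementation stays invisible to the user-agent.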