Gianugo Rabellino wrote:

> I have a basic concern here. URLs are, as the name suggests *locators*
> or *identifiers*. The idea is that via a URL you can locate (identify)
> data and fetch them: they were not designed to handle the opposite case
> where you have to send data to them. The HTTP POST is a workaround which
> is HTTP specific and goes way beyond the URL concept: there is no way to
> express in the URL syntax the *direction* of the data flow. And if you
> can't tell, looking at a URI, if it's "read" or "write" you will end up
> with troubles using it in an intermixed way.

Great analysis.

In fact, I believe that the sense of "outward biasing" of the Cocoon
internal pipelines is a reflection of this lack of direction
information for URIs.

Now, I believe that a URI should *NOT* carry any direction information,
because it's up to another concern island to come up with this (like
HTTP does).

For example, I was impressed by the elegance of the first servlet I saw
that used the doGet() method to generate the form, the doPost() method
to process it and a doError() method to generate the form with error
indications (called by the doPost() method directly). [I want this
elegance to be perceived from the statemap as well!]

> What can be done, of course, is to use the URL to lookup a resource and
> operate on the result (getting an OutputStream or an XmlConsumer to
> write or send events to). This is easy for existing resources. But what
> happens when you get ResourceNotFoundException? Should you pass the
> error or just create a new (empty) resource with the name given as the
> URI? I think that this is an arbitrary decision that has nothing to do
> with the URL concept, and this kind of scares me.

Ok, let's go top-down on my wish list:

1) indicate what "resource" we want to work on.

Note: "resource" is a much more neutral term than "source" or
"destination" since it doesn't convey a meaning of flow direction, but
just a "location", an identifier. Here, the URI is simply perfect, but
it should be used *only* to indicate the resource. Placing behavioral
information in it overlaps concerns.

  URI uri = new URI("protocol://host/path/resource");
  Resource resource = ResourceDiscovery.getResource(uri);

2) indicate what we want to do with this resource.

Note: if a resource is behaviorally neutral, we must specify the
behavior we want when interacting with it. Here, the concept of HTTP
actions is the example.

  resource.setAction(Resource.WRITE);

3) obtain the required connectors.

  resource.getOutputStream();

or

  resource.getContentHandler();

                               - o -

The above augments the java.net design patterns with explicit
behavioral additions.

The problem is that both steps 2 and 3 are behavior-dependent. For
example,

  URI uri = new URI("cvs://cvs.apache.org/xml-cocoon/README");
  Resource resource = ResourceDiscovery.getResource(uri);
  resource.setAction(Resource.READ);
  ((CVSResource) resource).fromBranch("xml-cocoon2");
  ContentHandler outputHandler = resource.getContentHandler();

but this requires casting. The following does not

  URI uri = new URI("cvs://cvs.apache.org/xml-cocoon/README?fromBranch='xml-cocoon2'");
  Resource resource = ResourceDiscovery.getResource(uri);
  resource.setAction(Resource.READ);
  ContentHandler outputHandler = resource.getContentHandler();

but could generate IllegalStateExceptions if a ContentHandler is *set*
on a READ action.

In fact, the behavior is automatically assumed by the call to the
connector (since the connector *does* convey direction information).

  URI uri = new URI("cvs://cvs.apache.org/xml-cocoon/README?fromBranch='xml-cocoon2'");
  Resource resource = ResourceDiscovery.getResource(uri);
  ContentHandler outputHandler = resource.getContentHandler();

Is this enough?

The uniform syntax of URIs allows for completely transparent
polymorphic behavior (or, at least, it seems so). There are URI-based
descriptors for a bunch of protocol interfaces (IMAP, POP, addressbook,
file, ftp, http, etc.).

Of course, there are protocol handlers that *must* throw exceptions if
some behaviors are not implementable. For example, the following should
throw an exception:

  URI uri = new URI("smtp://mail.myhost.com/");
  Resource resource = ResourceDiscovery.getResource(uri);
  InputStream is = resource.getInputStream(); // <--- throws exception!

because you can't read from an smtp resource.

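To make the lookup-by-scheme idea and the failing smtp read concrete,
here is a rough sketch. It is only an illustration under assumptions:
the registry inside ResourceDiscovery, the "mem:" scheme, the
SmtpResource and MemoryResource classes and the unchecked
InvalidMethodException are all hypothetical, and Resource is reduced to
a single read connector.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Assumption: modeled as unchecked to keep the sketch short.
class InvalidMethodException extends RuntimeException {
    InvalidMethodException(String message) { super(message); }
}

// Reduced view of the Resource idea: calling the connector conveys the
// direction, so no setAction() is needed.
interface Resource {
    InputStream getInputStream();
}

// Hypothetical write-only handler: reading from smtp: is refused.
class SmtpResource implements Resource {
    private final URI uri;
    SmtpResource(URI uri) { this.uri = uri; }
    @Override public InputStream getInputStream() {
        throw new InvalidMethodException("cannot read from " + uri);
    }
}

// Hypothetical readable handler for a made-up "mem:" scheme; it serves
// the scheme-specific part of the URI as its content.
class MemoryResource implements Resource {
    private final byte[] data;
    MemoryResource(URI uri) {
        this.data = uri.getSchemeSpecificPart().getBytes(StandardCharsets.UTF_8);
    }
    @Override public InputStream getInputStream() {
        return new ByteArrayInputStream(data);
    }
}

// Hypothetical discovery: picks a handler from the URI scheme alone.
final class ResourceDiscovery {
    private static final Map<String, Function<URI, Resource>> HANDLERS = new HashMap<>();
    static {
        HANDLERS.put("smtp", SmtpResource::new);
        HANDLERS.put("mem", MemoryResource::new);
    }
    private ResourceDiscovery() {}

    static Resource getResource(URI uri) {
        Function<URI, Resource> factory = HANDLERS.get(uri.getScheme());
        if (factory == null) {
            throw new IllegalArgumentException("no handler for scheme: " + uri.getScheme());
        }
        return factory.apply(uri);
    }
}

class ResourceDemo {
    public static void main(String[] args) throws IOException {
        Resource readable = ResourceDiscovery.getResource(URI.create("mem:hello"));
        System.out.println((char) readable.getInputStream().read());

        Resource mail = ResourceDiscovery.getResource(URI.create("smtp://mail.myhost.com/"));
        try {
            mail.getInputStream();
        } catch (InvalidMethodException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The calling code is identical for both schemes; only the handler chosen
by the scheme decides whether the connector can be handed out.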
Anyway, the Resource interface should have:

  interface Resource {

    // Writable connectors
    OutputStream getOutputStream() throws InvalidMethodException;
    Writer getWriter() throws InvalidMethodException;
    ContentHandler getContentHandler() throws InvalidMethodException;

    // Readable connectors
    InputStream getInputStream() throws InvalidMethodException;
    Reader getReader() throws InvalidMethodException;
    void setContentHandler(ContentHandler handler) throws InvalidMethodException;
  }

We could come up with some Monitorizable interface to add monitoring
capabilities that connect to the cache.

The URI schemes I find useful for Cocoon are:

1) file: -> obviously

2) dbxml: -> obvious again

3) http: -> reading is done thru GET, writing thru PUT or POST

4) webdav: -> [should this be different from HTTP?]

5) cvs: -> would allow Cocoon to generate the documentation directly
   out of CVS. A plus when you don't have much local storage capacity
   (say on a diskless embedded system, but maybe this is FS)

6) ftp: -> nobody uses FTP nowadays, but legacy systems do.

7) imap: -> would be killer for a Cocoon-based webmail application (I
   know Gianugo was thinking about implementing this) but probably a
   direct javamail interface would be much more useful.

8) smtp: -> would allow us to serialize a pipeline to email. Might be
   useful or might be FS, see above.

What I don't really find useful are:

1) sql -> how do you map a table to a path?

2) ldap -> yeah, the tree-like directory structure appears appealing,
   but how would you save an XML file into an LDAP tree? Fragment the
   entire document into nodes and store those? Bah, I don't find it
   very relevant.

Cocoon-specific stuff:

1) resource: -> gets stuff from the current classpath [might not be
   that useful once we have the others below, but it doesn't hurt to
   have it. Obviously, writing methods are illegal]

2) cocoon: -> gets stuff from Cocoon-served space.
   It would be killer to have both internal reading (as for content
   aggregation) and writing (as for content disassembly [the opposite
   of aggregation]: storing different namespaces in different
   locations, abstracting from the way they are implemented as
   pipelines).

I like the power that a bidirectional 'cocoon:' protocol would give us:
as content can be aggregated from different sources, even internal ones
(layered I/O is a feature that took years for the Apache 2.0 project to
implement and still they pass unstructured byte streams between
modules, making this virtually useless for SAX-based component
pipelines), we could now have a way to "disassemble" content into
different locations.

[that might require a new sitemap semantic, such as <map:disassemble>,
but that would be back-compatible since it's an addition]

Anyway, as reading from a cocoon: protocol identified resource allows
for more solid contracts to be defined (creating a level of indirection
that can be used to change underlying implementations without having to
change the rest), writing would be equivalently powerful.

For example,

  URI uri = new URI("cocoon://storage/" + relative_path);
  Resource resource = ResourceDiscovery.getResource(uri);
  setContentHandler(this.handler);

would allow writing information to a logical location, completely
detached from the physical implementation of the storing phase.

Then, we could have an internal-only sitemap associated with the
"/storage" URI.

Now, this appears as a cool concept but we have an impedance mismatch
that rings a bell:

1) for outward flow we have

  g1 -> t1 -> t2 -> s
         ^
         |
         t3
         ^
         |
         g2

where the internal serializer is removed because useless.

2) for inward flow we would have

  g -> t1 -> t2 -> s1
        |
        v
        t3 -> store
        |
        v
        s2

where the second generator is removed because useless.

Ok, but what is the second serializer doing? We could reshape it as

  g -> t1 -> t2 -> s1
        |
        v
        t3
        |
        v
        s2

where it is the "serializer" that actually performs the storage.

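A rough sketch of the branch point in the inward flow, assuming it can
be modeled as a plain SAX "tee": every event is forwarded both to the
main pipeline (towards s1) and to the storing branch (towards s2).
TeeHandler and RecordingSerializer are hypothetical names for this
illustration, not Cocoon components, and the "serializers" only record
the events as text.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical branch point: forwards each SAX event to two consumers,
// first the main pipeline, then the storing branch.
class TeeHandler extends DefaultHandler {
    private final ContentHandler main;
    private final ContentHandler branch;

    TeeHandler(ContentHandler main, ContentHandler branch) {
        this.main = main;
        this.branch = branch;
    }

    @Override public void startDocument() throws SAXException {
        main.startDocument();
        branch.startDocument();
    }

    @Override public void endDocument() throws SAXException {
        main.endDocument();
        branch.endDocument();
    }

    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) throws SAXException {
        main.startElement(uri, local, qName, atts);
        branch.startElement(uri, local, qName, atts);
    }

    @Override public void endElement(String uri, String local, String qName)
            throws SAXException {
        main.endElement(uri, local, qName);
        branch.endElement(uri, local, qName);
    }

    @Override public void characters(char[] ch, int start, int length)
            throws SAXException {
        main.characters(ch, start, length);
        branch.characters(ch, start, length);
    }
}

// Hypothetical stand-in for a serializer: writes the events it receives
// into a text buffer (attributes and escaping are ignored here).
class RecordingSerializer extends DefaultHandler {
    final StringBuilder out = new StringBuilder();

    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) {
        out.append('<').append(qName).append('>');
    }

    @Override public void endElement(String uri, String local, String qName) {
        out.append("</").append(qName).append('>');
    }

    @Override public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }
}

class TeeDemo {
    public static void main(String[] args) throws SAXException {
        RecordingSerializer s1 = new RecordingSerializer(); // main output
        RecordingSerializer s2 = new RecordingSerializer(); // storage
        TeeHandler tee = new TeeHandler(s1, s2);

        char[] text = "hi".toCharArray();
        tee.startDocument();
        tee.startElement("", "doc", "doc", new AttributesImpl());
        tee.characters(text, 0, text.length);
        tee.endElement("", "doc", "doc");
        tee.endDocument();

        System.out.println(s1.out + " | " + s2.out);
    }
}
```

In this sketch nothing downstream of the tee needs a generator of its
own, which matches the reshaped diagram: the branch ends in whatever
consumer actually stores the events.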
This has an interesting result on 'regular' pipelines: shouldn't

  data in -> g => t1 => t2 => s -> data out
                     |^
                     v|
                   store

where '->' is a binary stream and '=>' a SAX stream, be reshaped as

  data in -> g1 => t1 => store -> g2 => t2 => s -> data out

where "store" is now a full-blown serializer?

This would allow the pipelines to be more "symmetrical".

Bah, anyway, it turned out to be an RT. Let's see what you think about
this.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                       able to give birth to a dancing star.
<[EMAIL PROTECTED]>      Friedrich Nietzsche