One more thing to point out: according to our SEO team, resolving by mime type and thereby supporting both the .htm and .html extensions can seriously hurt search engine rankings, because engines may treat such pages as different pages and apply duplicate-content penalties.
Paddy

-----Original Message-----
From: "David Nuescheler" <[EMAIL PROTECTED]>
Date: Sun, 18 Nov 2007 18:40:05
To: [email protected]
Subject: Re: microsling: extension vs. mime-type script resolution

Hi Felix,

thanks for shedding some more light on the rationale.

> Actually, the resolution is based on the request url extension, only
> that there is some kind of an indirection, others call this
> abstraction :-) The problem is, that the extension is not all the times
> unique. For example, there are people using .htm and others that are
> using .html but all of them mean text/html. So instead of requiring two
> scripts for .htm and .html, we just require html (for text/html).

I still don't understand why anyone would require two scripts. In the end it is not up to the surfer to choose the extension. Much like in the static webserver case, where the webmaster decides whether to use .html or .htm for his static files, the person who builds a microsling website would decide how its extensions are mapped. I can't even come up with a use case where a "magic" .html and .htm extension equivalence would add value.

Actually, the current version of our WCMS also offers flexibility on the extension (interestingly enough, introduced back then with the exact same .html & .htm issue in mind), but after ~10 years of using it, I have to say there really is absolutely no need for it, and I would never be in favour of something like that.

> It gets of course more interesting in the case where there is no request
> extension: What script should we select ? Currently, we default to
> text/plain, which is probably not the correct solution.

I think that just illustrates the issue at hand in a very clear way. My expectation was that if I have no extension, I would map no script at all.
(Well, to the GET.esp script, if I really needed to. In most cases that script should stay untouched, and I would really encourage having the standard WebDAV GET executed, which in my mind is perfect in most cases.)

> > I think that the additional indirection to go through a mime-type table adds
> > complexity (since it is yet another mechanism) and is questionable to begin
> > with since somehow it seems to tie the response mime-type into the request
> > behaviour for no apparent reason.
> In fact there is one reason, and I think this reason validates it all:
> The client tells by means of the request, what he or she expects. So in
> the same way as the request URL indicates the desired content to be
> delivered, the response mime-type is derived from (a) the client request
> URL and (b) to the HTTP Accept header (not implemented in microsling,
> yet to be done in Sling) to indicate the expected response content type.
> This is no tie-in, just trying to meet up with the client's expectation.

Personally, I am afraid of having the Accept header influence the response mime-type. Frankly, in real life I have never seen something like that implemented, and I would be scared to even anticipate the effects on proxies and reverse proxies. Also, I think that clients are not very specific about that. I really would recommend serving resources that differ in mime-type under different URLs, and I would probably stay away from building any behaviors on Accept headers.

> > I know of a lot of cases where the
> > same response script offers responses in various (not limited) mime-types,
> > based on what the user uploaded to the repository.
> Well, this is a delicate issue: If as a client I request something with
> extension .txt I do expect a plain text response and not a GIF and I do
> absolutely not care what some other guy might have stored in the
> repository.
> If of course there is no plain text representation of the
> addressed content, I would expect an appropriate HTTP status code.

Some clients (Firefox) base their mime-type handling strictly on the response content-type header; others (IE) use a mix of extension, actual content and headers. So I agree that it is very good practice for the content-type header, the extension and the actual content to match up. No doubt.

I am not talking about extension to mime-type mappings like txt or html. Very frequently, extensions like .do, .jsp and .asp map to all sorts of mime-types, so the real-life web infrastructure is certainly capable of dealing with a 1-to-many relationship between extensions and mime-types.

Well, let's look at the following examples: Assume that /myuploads is a node that is used to spool uploaded nt:resource nodes for one reason or another. Let's say I have a jpg called my.jpg (which could also be a my.pdf ;) Now for the script resolution I would like to map .spool to my spool.esp rather than to anything even vaguely related to a mime-type.

  /myuploads.spool/my.jpg
or
  /myuploads.spool/my.jpg?get=1234-123423-34234-23455

> Hope, this makes my case clear for using the response content type for
> the script resolution,

I may not see it. To me this is just a dangerous and problematic indirection with a lot of potential for errors. Let's say my extension-to-mime-type table is different from that of another microsling install that I copy my scripts from; then things don't work out as expected. Syncing up people's mime-type tables definitely does not sound like a good plan to me.

> whereas the response content type is derived from
> the request extension (and in Sling in the future probably also from the
> Accept HTTP header).

In my mind this would be a very good default, but I would really leave this up to the script developer to override. Just like everybody else ;)

regards,
david
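
For readers following the thread, the two resolution strategies under discussion can be contrasted in a minimal Python sketch. This is not microsling code. The function names, both mime-type tables, and every script name except spool.esp are hypothetical. Taking the extension from the first dotted path segment and falling back to text/plain for unknown extensions are simplifying assumptions made only for illustration.

```python
# Hypothetical extension -> mime-type tables for two different installs.
MIME_TABLE_A = {"html": "text/html", "htm": "text/html", "txt": "text/plain"}
MIME_TABLE_B = {"html": "text/html", "txt": "text/plain"}  # no "htm" entry


def request_extension(url):
    """Return the extension of the first dotted path segment, or None.

    Assumption: the query string is ignored and the first segment that
    contains a dot carries the request extension.
    """
    path = url.split("?", 1)[0]
    for segment in path.split("/"):
        if "." in segment:
            return segment.rsplit(".", 1)[1]
    return None


def script_by_extension(url, scripts):
    """Direct mapping: extension -> script; no extension means no script."""
    ext = request_extension(url)
    if ext is None:
        return None
    return scripts.get(ext)


def script_by_mime_type(url, mime_table, scripts, default="text/plain"):
    """Indirect mapping: extension -> mime-type -> script keyed by subtype.

    Assumption: a missing or unknown extension falls back to `default`.
    """
    ext = request_extension(url)
    mime = mime_table.get(ext, default)
    return scripts.get(mime.split("/", 1)[1])


# Hypothetical script registries for the two strategies.
direct_scripts = {"html": "html.esp", "spool": "spool.esp"}
subtype_scripts = {"html": "html.esp", "plain": "plain.esp"}

print(script_by_extension("/myuploads.spool/my.jpg", direct_scripts))
print(script_by_mime_type("/content/page.htm", MIME_TABLE_A, subtype_scripts))
print(script_by_mime_type("/content/page.htm", MIME_TABLE_B, subtype_scripts))
```

With these assumed tables, the same .htm request selects a different script on each install, which is the table-syncing concern raised above, while the direct mapping depends only on the scripts themselves.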
