On Monday, Nov 3, 2003, at 09:52 Europe/Rome, Sylvain Wallez wrote:


Unico Hommes wrote:

<snip/>

CallFunctionNode.java ln 166/184:
// FIXME (SW) : is a flow allowed not to redirect ?

;-D

Uh (again)? I'm wondering if there's not a misunderstanding here: this FIXME is about knowing if a flowscript is allowed to terminate without stating what page is to be displayed, i.e. check if one of sendPage(), sendPageAndWait() or redirectTo() was called.

Sorry, but I don't see how this relates to HEAD, ETags et al... What was the change you proposed to do?


We were talking about the fact that it seemed impossible to serve a request without also sending an entity body along with the response (short of suppressing the output in the serializer, which is more of a hack than a solution). I thought it was allowed to call a flow function and then not send a page, but apparently I was wrong. Stefano agreed that it should be legal to call a flow function that does not redirect to a page, in order to cover the full range of HTTP better.


Specifically, we were discussing the specification of the OPTIONS method, which prescribes that "the response MUST NOT include entity information other than what can be considered as communication options". That seems to rule out sending an entity body in a legal response.

I traced the above location as the place the code would need to be changed in order to achieve this. But I could be wrong.


Sorry to say that, but... yes, I think so ;-)


IMO, this should be handled at the pipeline level, i.e. on a HEAD request, the pipeline should be built and set up, but not executed. And this for several reasons:
- not every request is handled by flowscript
- some pipeline components set response headers, such as the i18n transformer or the browser selector.
- if we use the pipeline key as the ETag (see below), the pipeline must be built and set up to compute that key.

not really. There are many cases in WebDAV/DeltaV/DASL where an HTTP request doesn't really generate content.... but needs lots of procedural logic to take care of it.


If you think of DeltaV, for example, actions such as VERSION, UPDATE, CHECKOUT, CHECKIN and so on don't require you to say anything other than a bunch of headers.

In this case, resorting to a pipeline is clearly overkill and we would simply like to call a flowscript function that does something, sets a bunch of headers and then, simply, terminates without calling any pipeline.

I don't know if Unico is right in pointing out that location, but this is a different concern: I think the above requirement is a big one and if we don't allow this execution, we might end up with extremely poor performance in webdav apps.

[I've done *extensive* tracing on how webdav/deltav/subversion works on the wire... boy, webdav *IS* verbose already and generates tons of requests/responses.... it is painful to see every 404 having a few 10kb of payload... especially when simply by browsing around, you generate tons of it for every PROPFIND]

We must realize that the world of HTTP doesn't stop at GET/POST!!!

Note that this pipeline-level handling is different from fooling the serializer by sending its output to /dev/null, since the processing chain is set up to get all required information, but not executed.

It seems like a waste of resources to me to set up a pipeline and then not use it. But, I don't understand... if I have


 <match>
  <call function="blah"/>
 </match>

and then

function blah() {
  cocoon.response.setHeader("DAV","1"); // header name carries no trailing colon
  // does *NOT* call sendPage*
}

where is the pipeline created?

Actually, this is not very different from what happens today when content is retrieved from the cache (the pipeline is built and set up but not executed).

This is different. The sitemap doesn't know, in advance, that no pipeline will be called.



BTW, can someone explain to me what ETags are about (I read that in the HTTP RFC a long time ago, but did not really understand it at that time).


I just looked. It seems entity tags are used as cache validators, similar to the Last-Modified header I guess, i.e. they encode the state of a resource entity so that clients can optimize network calls by sending along headers like If-Match, If-None-Match, If-Range, which are then checked against the current value of the entity tag on the server. If they match (or not) the method is executed. At least that's what I got out of it.
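To make the validator idea concrete, here is a minimal sketch of how a server might answer a GET that carries If-None-Match. It is plain JavaScript for Node, not Cocoon code: computeETag and handleConditionalGet are hypothetical names, hashing the body is only one assumed way of producing a tag, and weak tags (W/"...") are ignored for simplicity.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: derive a quoted entity tag from the entity body.
// Hashing the body is an assumption; any value that changes whenever the
// entity changes would do.
function computeETag(body) {
  const hash = crypto.createHash("sha1").update(body).digest("hex");
  return '"' + hash + '"';
}

// Decide how to answer a GET, given lower-cased request headers and the
// current entity body. Returns a plain description of the response.
function handleConditionalGet(requestHeaders, currentBody) {
  const etag = computeETag(currentBody);
  const ifNoneMatch = requestHeaders["if-none-match"];
  if (ifNoneMatch !== undefined) {
    const candidates = ifNoneMatch.split(",").map((tag) => tag.trim());
    if (candidates.includes("*") || candidates.includes(etag)) {
      // The client's copy is still current: headers only, no entity body.
      return { status: 304, headers: { ETag: etag }, body: null };
    }
  }
  return { status: 200, headers: { ETag: etag }, body: currentBody };
}

// First fetch, revalidation of an unchanged entity, revalidation after a change.
const first = handleConditionalGet({}, "hello");
const revalidated = handleConditionalGet({ "if-none-match": first.headers.ETag }, "hello");
const changed = handleConditionalGet({ "if-none-match": first.headers.ETag }, "hello, world");
console.log(first.status, revalidated.status, changed.status); // prints: 200 304 200
```

The 304 branch is the interesting one for this thread: it is a legal response that carries headers but no entity body.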



Don't really understand what resource _entity_ means, but it looks like the pipeline cache key could be used for the ETag. What do you think?

Not really. The cache "key" is attached to a "versionable resource", while the ETag is attached to the "resource entity", which means that the ETag is a unique and *permanent* identifier for that particular instance of that versionable resource. It is like the "version identifier" of that particular resource.


For example, using URIs, if

http://host/path/file

is a versionable resource (meaning that we keep track of its version.. in DeltaV terminology, this is "put under version control"), then

http://host/path/file/1.0.000

could be its URI and it is equivalent to its ETag... no matter what happens in the future, that resource is *immutable*. This is, for example, how Subversion works. Note that ETags are very useful for proxies: an immutable resource can be cached *forever*.

ETags are also useful for the lost update problem when no locking mechanism is in place:

person A does "GET / HTTP/1.1", obtains the page and an ETag
person B does "GET / HTTP/1.1", obtains the same page and same ETag
person A modifies the page (say, in linotype)
person B modifies the page as well (say, in OpenOffice)
person A does "PUT / HTTP/1.1", with header "If-Match" and the previous ETag
[the server sees that the ETag is the same, does the saving and the ETag is modified]
person B does "PUT / HTTP/1.1", with header "If-Match" and the ETag that he got originally
[the server returns a 412 Precondition Failed because the ETag doesn't match]


see http://www.w3.org/1999/04/Editing/ for more info on this

[At this point, it's up to the user-agent software to know what to do]
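The exchange above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: createResource is a hypothetical in-memory stand-in for the server, a version counter plays the role of the ETag, and a successful PUT answers 204. The 412 for a failed If-Match is what HTTP/1.1 specifies (Precondition Failed).

```javascript
// Hypothetical in-memory resource; not Cocoon or servlet API.
function createResource(initialBody) {
  let body = initialBody;
  let version = 1;
  // The entity tag changes every time the stored entity changes.
  const etag = () => '"v' + version + '"';
  return {
    get() {
      return { status: 200, etag: etag(), body: body };
    },
    // ifMatch is the value of the If-Match request header, if any.
    put(newBody, ifMatch) {
      if (ifMatch !== undefined && ifMatch !== "*" && ifMatch !== etag()) {
        // The caller edited a stale copy: refuse and leave the entity untouched.
        return { status: 412, etag: etag() };
      }
      body = newBody;
      version += 1;
      return { status: 204, etag: etag() };
    },
  };
}

const page = createResource("original");
const a = page.get();                        // person A fetches the page
const b = page.get();                        // person B fetches the same page and ETag
const putA = page.put("A's edit", a.etag);   // accepted, the ETag changes
const putB = page.put("B's edit", b.etag);   // rejected, B's ETag is stale
console.log(putA.status, putB.status, page.get().body); // prints: 204 412 A's edit
```

Person A's change survives, and person B's user agent learns from the 412 that it has to fetch the new version and merge or ask the user.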

I'm diving deeper and deeper into this stuff (also because of JSR 170) and the more I look into it, the more I think we are generally too ignorant of how HTTP really works. HTTP and friends are protocols of which we use, say, 5%... everything else is considered black magic and reinvented every time. This normally results in massive performance and scalability limitations.

But for Doco, I'm going to spend a serious effort to make things work the way the HTTP spec says.... in order to please the HTTPD people and in order to show that, no matter what web technology you use, if you know how network architectures operate, you scale massively.

But I'm going to fill this gap and, hopefully, influence you people back ;-)

And the work on the davmap is, IMO, going to trigger a lot of interesting redesigns in the internals.

--
Stefano.


