At 06:23 PM 10/2/2002, Greg Stein wrote:
>On Tue, Oct 01, 2002 at 07:10:32PM -0500, William A. Rowe, Jr. wrote:
>> At 05:19 PM 10/1/2002, Greg Stein wrote:
>>...
>> >As long as it is understood that only *one* thing can consume the request
>> >body. Then the question arises: how do you arbitrate that? It would be nice
>> >to simply say "the handler is the consumer" but that doesn't solve the case
>> >of a PHP script stored in an SVN repository. It is almost like we need some
>> >concept of stages leading up to request [body] processing and content
>> >generation.
>> 
>> Wrong.
>
>Boy... not very subtle, are you? :-)

That's not something I'm accused of very often... no :-)

>> Multiple things can access the properties from the request.
>> Consider some form variables.  One filter might be transforming on
>> variable X, while another performs some transformation on variable Z.
>> And they use the same storage for all small bits of the POST.
>
>Well... if by "properties" you mean the headers: sure. But only one person
>understands the POST body, and only one should be reading it.

No, there are several well defined content-types for POST bodies, all
of which define variables, or properties.  application/x-www-form-urlencoded,
multipart/form-data or application/xml are all reasonably well defined.

Others would clearly be considered 'not available' in unpacked form.
While the filters could all serialize and review the contents, the majority
of applications would probably not be interested in the contents, and
the others must be prepared to decipher the obscure client body.

>> In the case of upload requests, one consumer must 'sign up' to stream
>> the file, and provide a method for retrieving the contents if anyone else
>> cares {for the duration of the request.}  In theory, that consumer is the
>> script that wants to persist the POSTed file upload.  If nobody wants
>> to capture a huge POST body, we may end up with a tmpfile method,
>> but if the file will be persisted, there is no reason to stow it twice.
>
>This seems to assume that Apache [core] is processing POST bodies in some
>way, and then providing APIs for other modules/scripts to access the
>processed data.

No, we are suggesting that apreq be built into httpd.  At the same time,
the Apache apreq filter would ONLY process the client's body if an apreq
consumer of this specific request injects the apreq filter (through some
simple API) into the request process prior to the handler processing
the request.  Multiple calls to inject the apreq filter would resolve to a
single instance of the apreq filter and decomposition.
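
A rough sketch of that "inject once, parse once" behaviour, purely for
illustration.  Every name here (sketch_request, sketch_inject_filter,
sketch_run_input_filters, the "APREQ" filter name) is hypothetical and is
not part of httpd or libapreq; it is a standalone model, not the proposed
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* Hypothetical stand-in for a request's input filter chain.
 * This is NOT httpd's request_rec. */
typedef struct {
    const char *filters[MAX_FILTERS];
    size_t count;
    int body_parsed;   /* how many times the client body was decomposed */
} sketch_request;

/* Hypothetical inject call: adds the named filter only if it is not
 * already on the chain.  Returns 1 if added, 0 if already present. */
static int sketch_inject_filter(sketch_request *r, const char *name)
{
    size_t i;
    for (i = 0; i < r->count; i++) {
        if (strcmp(r->filters[i], name) == 0)
            return 0;          /* second consumer shares the instance */
    }
    assert(r->count < MAX_FILTERS);
    r->filters[r->count++] = name;
    return 1;
}

/* The body is decomposed only if some consumer injected the filter,
 * and then exactly once, however many consumers asked for it. */
static void sketch_run_input_filters(sketch_request *r)
{
    size_t i;
    for (i = 0; i < r->count; i++) {
        if (strcmp(r->filters[i], "APREQ") == 0)
            r->body_parsed++;
    }
}
```

Two consumers calling sketch_inject_filter(&r, "APREQ") leave one entry on
the chain, and a request where nobody calls it never touches the body.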

>I'd be very leery of automatic POST body processing in any form. Providing
>standard functions? Sure. And that is what libapreq is about, right? :-)

Automatic?  Only by request.  If a tree falls and no one is listening, any
sound it made is wasted.  Same with apreq ... I sure don't want to waste
memory or cpu breaking apart POST bodies with no consumers :-)

>> >But they are two modes of operation. In one, you generate the original
>> >content (e.g. a PROPFIND or GET response against a database), in the 
>> >other you're filtering content.
>> 
>> In both cases you are transforming a PHP script into the resulting
>> output from processing the script.  No difference, truly.
>
>At a technical level, no, not really. But the problem is that the filter
>stack is about filtering the *content*. If the *script* is travelling
>through the filter chain, then you've got issues :-)

I don't really see any issues.  Please elaborate :-)

>The workaround to that is to always ensure that the script processor is the
>first filter on the chain. But that seems kind of hacky... you could end up
>with some contention there, with filters trying to jocky their way into
>"first". What if a filter got itself in front of PHP? Uh oh. (yes, it would
>be fine if the filter knows it is processing a script, but if it *doesn't...)

Why 'first'?  Perhaps I have a mixture of javascript and php in a composite
document.  Perhaps I want one, then the other to process the data.  In any
case, I (the admin) should be able to order these filters.  Right now we are
really not there in terms of ordering, but we all agree that's got to be fixed.

>As an example of fighting for first: the OLD_WRITE filter. How do we
>arbitrate that "first" spot between PHP (or some other script processor) and
>OLD_WRITE?
>
>That is why I suggested there might be a need for some kind of staged
>process where we manage non-content data.

If we want to have a specific domain for 'parser/scripts' in the stack, just
like we have request and connection oriented filters today, that would be
fine by me.  See my comments on ordering above.

>>...
>> >> And that said, you can't break POST to the default handler, please
>> >> revert that change.
>> >
>> >The POST behavior before my fix was a regression against 1.3. Its existence
>> >was to support the PHP-processing-as-a-filter implementation. The impact of
>> >adding the POST "feature" to the default handler was never really detected
>> >until we uncovered it via a bug in mod_dav. Since this regression was found,
>> >I went in and fixed it. Unfortunately, that breaks PHP :-) But I think the
>> >PHP integration is broken and needs to be fixed.
>> >
>> >While new PHP integration is being solved, we can *temporarily* revert the
>> >change so that PHP users aren't inconvenienced while the integration is
>> >worked on. But the default_handler fix does need to go in to prevent the
>> >regression.
>> >
>> >(I term it a regression because 1.3 didn't do it, and POSTing to something
>> > which isn't going to handle the posted content is not right; we should be
>> > generating an error that states the POST is not allowed for that resource;
>> > that is what 405 is all about)
>> 
>> We should continue this discussion after the change is reverted,
>> and let's find a better answer.
>
>The default handler shouldn't be serving up the file for a POST. The
>original change that did that was wrong.

The RFC leaves it undefined.  Your religion or mine?  (Or rbb's?)

Here's where I agree.  If no module injects that ->accept ought to include
POST, then POST should be declined by the default handler.  But if any
module raises its hand and says, YES, I want to accept POST requests
on these 'files', then the default handler should concur.  So PHP could
inject the POST method into accepted methods, and the default handler
should respect that.

And the same philosophy should apply to any method that a module wants
to apply against the default handler.  The default handler is simply an
endpoint for any filtering designs that originate from a file.
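
To make the rule concrete, here is a minimal sketch under stated
assumptions: the accepted-method mask, sk_request, sk_accept_method and
sk_default_handler are all made-up names for this example, not existing
httpd fields or functions, and GET is assumed to be the only method the
default handler accepts on its own.

```c
#include <assert.h>

/* Hypothetical method bits; httpd has its own method numbering. */
enum { SK_GET = 1 << 0, SK_POST = 1 << 1, SK_PROPFIND = 1 << 2 };

#define SK_HTTP_OK                 200
#define SK_HTTP_METHOD_NOT_ALLOWED 405

/* Hypothetical per-request state: the method of this request plus a
 * mask of methods some module has volunteered to handle. */
typedef struct {
    unsigned method;
    unsigned accepted;
} sk_request;

/* Assumption: by default only GET is served from a plain file. */
static void sk_init(sk_request *r, unsigned method)
{
    r->method = method;
    r->accepted = SK_GET;
}

/* A module (e.g. a script processor installed as a filter) raises its
 * hand for a method on this request. */
static void sk_accept_method(sk_request *r, unsigned method)
{
    r->accepted |= method;
}

/* The default handler concurs only if someone accepted the method;
 * otherwise it declines with 405. */
static int sk_default_handler(const sk_request *r)
{
    return (r->method & r->accepted) ? SK_HTTP_OK
                                     : SK_HTTP_METHOD_NOT_ALLOWED;
}
```

A POST with no volunteer gets 405; once a module calls
sk_accept_method(&r, SK_POST) the same request is served.  Nothing in the
mask is specific to POST, so the same call covers any other method.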

>> Going back to the apreq input filter, I can see where it would share
>> with the core if anyone registered to review the client body.  If someone
>> does, the handler could be toggled to accept POST.  If nobody registers,
>> then we could decline POST.
>
>I like this.

So do I :-)

>However, it really shouldn't be tied to POST, but should work for any
>method. And it would seem to be a new request_rec field. However, to avoid
>an MMN bump, I might suggest that we put the flag into the core config
>record (core_request_config structure) and an API to set the flag.

Same page here :-)  I'm suggesting the r->accept field might work fine.

>So the next question then becomes: what to call the API function? Maybe
>ap_method_requires_script()? It basically says "normally, this method does
>not return the resource as content, but I require it because I'll interpret
>it as a script to process this method." Script processors which are
>installed as filters would call it. We would also recommend that the
>processors have some way to configure which methods the script is actually
>intended to handle, and call the function based on whether the script will
>truly handle it (and if not, then the default behavior will return the
>normal 405 (Method Not Allowed)).

See also the concepts of AcceptPathInfo... We are rather in the same place
with the PATH_INFO concept and POST and even client bodies against GET.

>I'll go fix the code to introduce the flag, and I'll default the flag (for
>now) to provide the old behavior. Once we get a function name and add the
>function, then we can change the default and have the PHP filter call the
>thing to enable delivering scripts into the filter stack.

Again, I don't intend to incorporate this into .43 - please proceed as you will,
and let's work through all these design issues.  My weekend is set aside to
finally provide some of the plug-ins for apreq in preparation for porting it as
a filter ;-)

Bill

