On Mon, Aug 1, 2016 at 2:13 PM, Jacob Champion <champio...@gmail.com> wrote:

> All right, getting back to this after a week off. I've tried to combine
> feedback as best I can into one message.
>
> Bill, you wrote:
>
> I'm perfectly happy to translate such values to GMT for non-HTTP
>> inputs, per spec. If we are going to do so for HTTP inputs, loudly
>> scolding the errant developer in the logs seems prudent, for their own
>> longer-term benefit.
>>
>
> I suspect this is the core of the argument now. In my opinion, CGI is not
> an HTTP input -- either as far as the spec is concerned (for spec lawyering
> purposes) or in practice. It is a separate (HTTP-like) protocol, and an
> implementation detail of the server. In other words, we are not "proxying"
> to a CGI app; the CGI app is contained by the server and is providing
> inputs to the server response.
>
> That is a separate argument from "is it wise to correct GMT timestamps",
> though.
>
> 1. Why do you/reporter want HTTP applications to persist in writing code
>> which breaks between different transport providers/cgi hosting
>> environments?
>> The language has been crystal clear for 2 decades. We do a huge disservice
>> to the PHP author community to let them be idiots. Alternately, the PHP
>> SAPI itself could rectify this. (We aren't talking about non-HTTP
>> sources.)
>>
>
> I'm not sure where PHP enters the conversation. They are only one (large
> and important!) CGI producer; we're talking about our behavior with *all*
> CGI applications here.
>
> I do like your argument that we should do as little transformation as
> possible, in order to facilitate moving CGI apps between environments.
> Implementation differences are nasty for everyone. But I'm not convinced
> that ship hasn't sailed; currently, it looks like we modify outgoing CGI
> responses in order to merge headers, normalize Content-Type, and produce
> Unmodified and Precondition Failed responses.
>

Merging headers is a well-defined behavior in the HTTP spec (see my earlier
comments on RFC 3875). Three CGI header fields are CGI-specific. I don't
know that you can correlate these mandated adjustments to date formats.

There may be others I have missed, but this doesn't look like the behavior
> of a server that considers itself a transparent "passthrough" to a CGI
> application. (Isn't that what CGI-NPH is for?) But! I could definitely be
> swayed otherwise, if that's what we'd like to do moving forwards. I think
> both sides have potential value, but we should choose one.


Let's not even get into NPH scripts... heh.


> If there is date input that we cannot handle, the
>> spec strongly encourages us to interpret it as now(), provided we have a
>> clock (which all of our architectures do.)
>>
>
> In the absence of a quote from the spec, I'm still in strong disagreement
> with this, based on the language I quoted last week.
>

The spec demands future -> now() if we know now(). On other erroneous data,
that's up for debate.

Moving on to Stefan's comments:
>
> If we see CGI as a kind of input that is not strictly regulated by
>> HTTP header formats (and that is an if), we should correct timezone
>> offset to GMT, but otherwise leave the time unchanged. It might be our
>> clock that has the issue. Meddling with it will not help anyone
>> debugging problems.
>>
>
> +1 (and I am currently of the opinion that CGI is not a strict HTTP input,
> as stated above).
>

Jacob and I have entirely different opinions on that, Stefan's comment
is very interesting. We have a clock on nearly every deployment of
httpd, but that doesn't mean it is correct. However the spec absolutely
demands that we do not send our untrusted "Date:" with a future
"Last-Modified:" value, it is externally inconsistent.


> If the value is unparseable, we should log it and suppress sending
>> outa "Last-Modified" completely. Also any "If-*" checking should
>> behave as if the header was not present.
>>
>
> +1.


Sounds right.


> The alternative is to expect the CGI to honor HTTP/1.1 header
>> semantics, pass values unchanged and let CGI and client run into
>> misunderstandings immediately.
>>
>
> Practically, I'm not super opposed to this alternative (but if we choose
> it, we should apply it consistently). If I put on spec-lawyer hat, the CGI
> RFC has this to say:
>
> [https://tools.ietf.org/html/rfc3875#section-6.2.1]
>
>    The server MUST make any appropriate modifications to the script's
>>    output to ensure that the response to the client complies with the
>>    response protocol version.
>>
>
> So this alternative is not my first choice. Invalid headers should really
> either be corrected (if the correction is obvious, safe, and helpful), or
> dropped entirely. Or the entire response should be 500'd, but we run into
> major compatibility breaks if we choose that route.
>

No, I agree that a 500 is not an option. Drop it, or fix it loudly in the
logs,
but we can't break existing deployments (which don't accept non-GMT
dates today, FWIW).


> And finally, from the latest patch from Luca:
>
> 2) Some comments have been added in the code to state clearly that
>> anynon compliant datetime strings will not be interpreted or re-formatted.
>>
>
> As stated above, this is not my first choice -- but I wouldn't oppose it
> if that's what the consensus comes to.
>

Right, the comments call out what the HTTP spec demanded.


>          else if (!ap_cstr_casecmp(w, "Last-Modified")) {
>> -            apr_time_t parsed_date = apr_date_parse_rfc(l);
>> +            apr_time_t parsed_date = apr_date_parse_http(l);
>>
>
> apr_date_parse_http() is not good enough; IIUC, it completely ignores
> timezones, which further corrupts non-GMT Last-Modified stamps. We either
> want strict parsing or actual correction, not something in the middle.


I'm happy to react to bad input following a parsable date string, e.g. any
non-"GMT" signifier, as entirely bad input worth emitting to the error log.
Let's help CGI authors find their bugs instead of generating unexpected
results, such as 5-8 hour wrong "Last Modified" dates in the US and
now() throughout Europe, owing to the time zone deltas.

Reply via email to