2016-08-01 21:13 GMT+02:00 Jacob Champion <champio...@gmail.com>
>
>
> As stated above, this is not my first choice -- but I wouldn't oppose it
> if that's what the consensus comes to.
>
>          else if (!ap_cstr_casecmp(w, "Last-Modified")) {
>> -            apr_time_t parsed_date = apr_date_parse_rfc(l);
>> +            apr_time_t parsed_date = apr_date_parse_http(l);
>>
>
> apr_date_parse_http() is not good enough; IIUC, it completely ignores
> timezones, which further corrupts non-GMT Last-Modified stamps. We either
> want strict parsing or actual correction, not something in the middle.


You are right, but this is the current behavior, this is why I used
apr_date_parse_http. My idea was to keep what we have at the moment, adding
a clear log that states the inconsistency found by httpd and why it has
been corrected. In this case it would simply mean that a value considered
"in the future" from GMT would be logged and corrected accordingly. This
would still leave out other wrong use cases like a datetime string in the
US/West timezone "in the future" if converted in GMT, but to solve it we'd
need to interpret the Last-Modified header value with the
apr_date_parse_rfc function and afaiu we still need to reach an agreement :)

So I am really in favor of using apr_date_parse_rfc but I wanted to come up
with a possible to solution to satisfy everybody and help our users with
some clue.

Example about the wrong use case mentioned by me for everybody (it helped
me a lot so it might also be useful to other people reading):

===================
HTTP/1.1 200 OK
Date: Tue, 02 Aug 2016 07:54:39 GMT
Server: Apache/2.5.0-dev (Unix)
Last-Modified: Tue, 02 Aug 2016 00:54:39 GMT
Content-Type: text/html; charset=UTF-8

Last-Modified sent: Tue, 02 Aug 2016 00:54:39 -0700
Now in GMT: Tue, 02 Aug 2016 07:54:39 +0000
===================

As you can see, the FCGI script sets a Last-Modified header value now() in
the "America/Los_Angeles" timezone, but it gets interpreted by httpd as a
GMT datestring "in the past". So for example "now() + 2 hours" in the
"America/Los_Angeles" timezone would still get interpreted as "in the past"
by httpd, avoiding any datetime correction to GMT now().

Same example with the code change that we have originally proposed:

===================
HTTP/1.1 200 OK
Date: Tue, 02 Aug 2016 08:08:14 GMT
Server: Apache/2.5.0-dev (Unix)
Last-Modified: Tue, 02 Aug 2016 08:08:14 GMT
Content-Type: text/html; charset=UTF-8

Last-Modified sent: Tue, 02 Aug 2016 01:08:14 -0700
Now in GMT: Tue, 02 Aug 2016 08:08:14 +0000
===================

The debate is all around, as far as I have understood, to if httpd should
be able to do the above "interpretation" or not.


2016-08-01 23:13 GMT+02:00 Jacob Champion <champio...@gmail.com>:

> On 08/01/2016 01:35 PM, William A Rowe Jr wrote:
>
>>     So this alternative is not my first choice. Invalid headers should
>>     really either be corrected (if the correction is obvious, safe, and
>>     helpful), or dropped entirely. Or the entire response should be
>>     500'd, but we run into major compatibility breaks if we choose that
>>     route.
>>
>> No, I agree that a 500 is not an option. Drop it, or fix it loudly in
>> the logs,
>> but we can't break existing deployments (which don't accept non-GMT
>> dates today, FWIW).
>>
>
> Based on conversations with Luca, existing deployments do "accept" non-GMT
> dates, silently corrupting the value, due to the use of the *_parse_http()
> function.
>
> I'm happy to react to bad input following a parsable date string, e.g. any
>> non-"GMT" signifier, as entirely bad input worth emitting to the error
>> log.
>> Let's help CGI authors find their bugs instead of generating unexpected
>> results, such as 5-8 hour wrong "Last Modified" dates in the US and
>> now() throughout Europe, owing to the time zone deltas.
>>
>
> Cool. I'm not opposed to loudly dropping non-GMT timestamps (though I
> still prefer fixing them up).


Wouldn't this introduce a breaking change? I know that everybody is
supposed to look into the changelog before upgrading httpd but as we often
see this is not the case and we could trigger nasty issues in various
installations imho. Compared to this solution I'd much prefer to go for the
proposed one, invasive but less disruptive :)

As Jacob said, more opinions are very welcome!

One general comment: I really like discussions but in this case I feel that
we would have solved the issue in 20 minutes talking in person :)
So maybe a conversation on IRC whenever everybody has a bit of time could
help reaching a final consensus?

Luca

Reply via email to