2016-08-01 21:13 GMT+02:00 Jacob Champion <champio...@gmail.com> > > > As stated above, this is not my first choice -- but I wouldn't oppose it > if that's what the consensus comes to. > > else if (!ap_cstr_casecmp(w, "Last-Modified")) { >> - apr_time_t parsed_date = apr_date_parse_rfc(l); >> + apr_time_t parsed_date = apr_date_parse_http(l); >> > > apr_date_parse_http() is not good enough; IIUC, it completely ignores > timezones, which further corrupts non-GMT Last-Modified stamps. We either > want strict parsing or actual correction, not something in the middle.
You are right, but this is the current behavior, this is why I used apr_date_parse_http. My idea was to keep what we have at the moment, adding a clear log that states the inconsistency found by httpd and why it has been corrected. In this case it would simply mean that a value considered "in the future" from GMT would be logged and corrected accordingly. This would still leave out other wrong use cases like a datetime string in the US/West timezone "in the future" if converted in GMT, but to solve it we'd need to interpret the Last-Modified header value with the apr_date_parse_rfc function and afaiu we still need to reach an agreement :) So I am really in favor of using apr_date_parse_rfc but I wanted to come up with a possible to solution to satisfy everybody and help our users with some clue. Example about the wrong use case mentioned by me for everybody (it helped me a lot so it might also be useful to other people reading): =================== HTTP/1.1 200 OK Date: Tue, 02 Aug 2016 07:54:39 GMT Server: Apache/2.5.0-dev (Unix) Last-Modified: Tue, 02 Aug 2016 00:54:39 GMT Content-Type: text/html; charset=UTF-8 Last-Modified sent: Tue, 02 Aug 2016 00:54:39 -0700 Now in GMT: Tue, 02 Aug 2016 07:54:39 +0000 =================== As you can see, the FCGI script sets a Last-Modified header value now() in the "America/Los_Angeles" timezone, but it gets interpreted by httpd as a GMT datestring "in the past". So for example "now() + 2 hours" in the "America/Los_Angeles" timezone would still get interpreted as "in the past" by httpd, avoiding any datetime correction to GMT now(). Same example with the code change that we have originally proposed: =================== HTTP/1.1 200 OK Date: Tue, 02 Aug 2016 08:08:14 GMT Server: Apache/2.5.0-dev (Unix) Last-Modified: Tue, 02 Aug 2016 08:08:14 GMT Content-Type: text/html; charset=UTF-8 Last-Modified sent: Tue, 02 Aug 2016 01:08:14 -0700 Now in GMT: Tue, 02 Aug 2016 08:08:14 +0000 =================== The debate is all around, as far as I have understood, to if httpd should be able to do the above "interpretation" or not. 2016-08-01 23:13 GMT+02:00 Jacob Champion <champio...@gmail.com>: > On 08/01/2016 01:35 PM, William A Rowe Jr wrote: > >> So this alternative is not my first choice. Invalid headers should >> really either be corrected (if the correction is obvious, safe, and >> helpful), or dropped entirely. Or the entire response should be >> 500'd, but we run into major compatibility breaks if we choose that >> route. >> >> No, I agree that a 500 is not an option. Drop it, or fix it loudly in >> the logs, >> but we can't break existing deployments (which don't accept non-GMT >> dates today, FWIW). >> > > Based on conversations with Luca, existing deployments do "accept" non-GMT > dates, silently corrupting the value, due to the use of the *_parse_http() > function. > > I'm happy to react to bad input following a parsable date string, e.g. any >> non-"GMT" signifier, as entirely bad input worth emitting to the error >> log. >> Let's help CGI authors find their bugs instead of generating unexpected >> results, such as 5-8 hour wrong "Last Modified" dates in the US and >> now() throughout Europe, owing to the time zone deltas. >> > > Cool. I'm not opposed to loudly dropping non-GMT timestamps (though I > still prefer fixing them up). Wouldn't this introduce a breaking change? I know that everybody is supposed to look into the changelog before upgrading httpd but as we often see this is not the case and we could trigger nasty issues in various installations imho. Compared to this solution I'd much prefer to go for the proposed one, invasive but less disruptive :) As Jacob said, more opinions are very welcome! One general comment: I really like discussions but in this case I feel that we would have solved the issue in 20 minutes talking in person :) So maybe a conversation on IRC whenever everybody has a bit of time could help reaching a final consensus? Luca