2016-08-04 11:52 GMT+02:00 Luca Toscano <toscano.l...@gmail.com>: > > > 2016-08-02 10:17 GMT+02:00 Luca Toscano <toscano.l...@gmail.com>: > >> >> 2016-08-01 21:13 GMT+02:00 Jacob Champion <champio...@gmail.com> >>> >>> >>> As stated above, this is not my first choice -- but I wouldn't oppose it >>> if that's what the consensus comes to. >>> >>> else if (!ap_cstr_casecmp(w, "Last-Modified")) { >>>> - apr_time_t parsed_date = apr_date_parse_rfc(l); >>>> + apr_time_t parsed_date = apr_date_parse_http(l); >>>> >>> >>> apr_date_parse_http() is not good enough; IIUC, it completely ignores >>> timezones, which further corrupts non-GMT Last-Modified stamps. We either >>> want strict parsing or actual correction, not something in the middle. >> >> >> You are right, but this is the current behavior, this is why I used >> apr_date_parse_http. My idea was to keep what we have at the moment, adding >> a clear log that states the inconsistency found by httpd and why it has >> been corrected. In this case it would simply mean that a value considered >> "in the future" from GMT would be logged and corrected accordingly. This >> would still leave out other wrong use cases like a datetime string in the >> US/West timezone "in the future" if converted in GMT, but to solve it we'd >> need to interpret the Last-Modified header value with the >> apr_date_parse_rfc function and afaiu we still need to reach an agreement :) >> >> So I am really in favor of using apr_date_parse_rfc but I wanted to come >> up with a possible to solution to satisfy everybody and help our users with >> some clue. >> >> Example about the wrong use case mentioned by me for everybody (it helped >> me a lot so it might also be useful to other people reading): >> >> =================== >> HTTP/1.1 200 OK >> Date: Tue, 02 Aug 2016 07:54:39 GMT >> Server: Apache/2.5.0-dev (Unix) >> Last-Modified: Tue, 02 Aug 2016 00:54:39 GMT >> Content-Type: text/html; charset=UTF-8 >> >> Last-Modified sent: Tue, 02 Aug 2016 00:54:39 -0700 >> Now in GMT: Tue, 02 Aug 2016 07:54:39 +0000 >> =================== >> >> As you can see, the FCGI script sets a Last-Modified header value now() >> in the "America/Los_Angeles" timezone, but it gets interpreted by httpd as >> a GMT datestring "in the past". So for example "now() + 2 hours" in the >> "America/Los_Angeles" timezone would still get interpreted as "in the past" >> by httpd, avoiding any datetime correction to GMT now(). >> >> Same example with the code change that we have originally proposed: >> >> =================== >> HTTP/1.1 200 OK >> Date: Tue, 02 Aug 2016 08:08:14 GMT >> Server: Apache/2.5.0-dev (Unix) >> Last-Modified: Tue, 02 Aug 2016 08:08:14 GMT >> Content-Type: text/html; charset=UTF-8 >> >> Last-Modified sent: Tue, 02 Aug 2016 01:08:14 -0700 >> Now in GMT: Tue, 02 Aug 2016 08:08:14 +0000 >> =================== >> >> The debate is all around, as far as I have understood, to if httpd should >> be able to do the above "interpretation" or not. >> >> >> 2016-08-01 23:13 GMT+02:00 Jacob Champion <champio...@gmail.com>: >> >>> On 08/01/2016 01:35 PM, William A Rowe Jr wrote: >>> >>>> So this alternative is not my first choice. Invalid headers should >>>> really either be corrected (if the correction is obvious, safe, and >>>> helpful), or dropped entirely. Or the entire response should be >>>> 500'd, but we run into major compatibility breaks if we choose that >>>> route. >>>> >>>> No, I agree that a 500 is not an option. Drop it, or fix it loudly in >>>> the logs, >>>> but we can't break existing deployments (which don't accept non-GMT >>>> dates today, FWIW). >>>> >>> >>> Based on conversations with Luca, existing deployments do "accept" >>> non-GMT dates, silently corrupting the value, due to the use of the >>> *_parse_http() function. >>> >>> I'm happy to react to bad input following a parsable date string, e.g. >>>> any >>>> non-"GMT" signifier, as entirely bad input worth emitting to the error >>>> log. >>>> Let's help CGI authors find their bugs instead of generating unexpected >>>> results, such as 5-8 hour wrong "Last Modified" dates in the US and >>>> now() throughout Europe, owing to the time zone deltas. >>>> >>> >>> Cool. I'm not opposed to loudly dropping non-GMT timestamps (though I >>> still prefer fixing them up). >> >> >> Wouldn't this introduce a breaking change? I know that everybody is >> supposed to look into the changelog before upgrading httpd but as we often >> see this is not the case and we could trigger nasty issues in various >> installations imho. Compared to this solution I'd much prefer to go for the >> proposed one, invasive but less disruptive :) >> >> As Jacob said, more opinions are very welcome! >> >> One general comment: I really like discussions but in this case I feel >> that we would have solved the issue in 20 minutes talking in person :) >> So maybe a conversation on IRC whenever everybody has a bit of time could >> help reaching a final consensus? >> > > > Trying with a new patch: http://apaste.info/xHz > > I added two separate logs, one for non-GMT header values and the other one > for datetime strings considered in the future, restoring > apr_date_parse_http instead of apr_date_parse_rfc. > > The example in my previous email would lead to the first error log, > meanwhile a Last-Modified header in the Europe/Paris timezone would lead to > both. > > Hope that this could be a good compromise (modulo changes to wording > and/or log format that are very welcome), otherwise I am starting to run > out of ideas :) > > Comments about the last patch? I would be inclined to proceed with something like http://apaste.info/xHz (wording and logging can be changed of course) to unblock the discussion, possibly talking about other ideas and next steps in the future.
Regards, Luca