Shantonu Sen <[EMAIL PROTECTED]> writes:
> | I've taken out military zones, and this appears to have fixed up some
> | of the errant parsing when compared to 1.0.4. Dan, can you do
> | another diff to check this (I just checked in the changes). Also, you
> | will still see many differing lines because dtimep no longer treats
> | timezone-unqualified mails as having originated in the current zone.
> | Those emails will show up as GMT, which I think is totally reasonable.

Well, the last time I ran the ad-hoc test, it was against my inbox,
which of course has changed since then, so this isn't all that
meaningful, but now 1.0.4 and 1.0.4+dev produce the same output for it.

Therefore this time I ran the test against all my mail folders. There
are still lots of differences between 1.0.4 and 1.0.4+dev. You mention
in ChangeLog:

    * Took out bad time textual time zones like BST and JST. I found
      them online somewhere, but am not sure if they're correct.

Were they causing a problem or did you just remove them because you
were unsure of them? The first difference in my folders is a mail from
someone in Japan with "JST" in the date. Now, it looks like what 1.0.4
printed for that date was incorrect as well (it comes up with an offset
of +02, which doesn't seem to make sense), but 1.0.4+dev proclaims it
to be GMT.

The next diff I come across is an "NZS" timezone, which previously
printed out as +01 (again, seems wrong) but now says GMT.

This one's not a huge deal (the new output is right, I guess, just
unfriendly), but "Thu, 25 May 2000 20:19:10 -800" used to come out as
"20:19PST" in the scan.time output, but is now "20:19-08".

Okay, here's another one that was wrong before and is wrong in a
different way now. "Mon, 3 Jul 2000 12:40:54 CEST" previously printed
as "12:40EST" (wonder if it really thought it was U.S.
Eastern Standard Time or if the initial "C" was getting cropped due to
an erroneous assumption that all textual timezones were 3 characters),
and now prints as "12:40GMT".

Here's an interesting one. "Wed, 26 Jul 2000 09:52:40 +1000 (EST)"
previously printed as "09:52+10" but is now "09:52EDT".

Okay, the next several differences are the "no timezone -> GMT instead
of local timezone" change you already mentioned. I know that logically,
interpreting no zone as GMT makes more sense, but I wonder which type
of timezoneless date comes up more _often_. It may have been the way it
was for a good reason (e.g. common old versions of sendmail that gave
no timezone on local mails or something).

The next differences seem to be a good change. "JST +900" was
previously output as that +02 again, but is now +09.

Wow, here's a date format I don't recall seeing before. An automated
email from eBay with the date "Wed, 01 Dec 1999 20:55:20 Pacific
Standard Time" previously was incorrectly "20:55EST" and is now
incorrectly "20:55GMT". Dunno if that date format is RFC-legal, but at
least it's unambiguous...

The next one may be an OK change. "Wed, 29 Mar 2000 15:11:23 -0600
(EST)" (the wrong offset for EST, no?) previously printed as "15:11CST"
(right per the offset?) but now blindly trusts the "EST", which I guess
is okay. However, I've heard that there are duplicate ASCII timezones,
so perhaps we ought to trust the numeric offset, if present, over the
timezone strings.

Okay, this one's way bogus. "Mon, 3 Apr 2000 21:11:21 +0000 (GMT)" was
previously correctly "21:35-00" and is now "21:35BST".

Here's another exotic one (Anchorage Daylight Time perhaps??). "Tue, 11
Apr 2000 04:58:07 AKDT" was "04:58+07" (correct?) but is now
"04:58GMT".

Here's an interesting one. A mail from someone in Australia with the
date "Thu, 25 May 2000 16:35:02 +0930 (CST)" (a duplicate "CST", I
assume?) previously was "16:35+09" (best we can do in 8 characters'
width) but is now "16:35CDT".
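
The cases above where a date carries both a numeric offset and a
textual zone ("+1000 (EST)", "-0600 (EST)", "+0930 (CST)") suggest a
simple precedence rule. As a rough illustration only (Python rather
than nmh's C; the ZONE_MINUTES table and the zone_offset_minutes()
helper are hypothetical names, not anything in nmh), the rule could be
sketched like this:

```python
import re

# Hypothetical lookup table: only the zone names defined in RFC 822.
# This is an assumption for illustration, not nmh's actual table.
ZONE_MINUTES = {
    "UT": 0, "GMT": 0,
    "EST": -300, "EDT": -240,
    "CST": -360, "CDT": -300,
    "MST": -420, "MDT": -360,
    "PST": -480, "PDT": -420,
}

# A sign followed by 3 or 4 digits, e.g. "+1000", "-0600", or the
# sloppy "-800" / "+900" forms seen in real mail.
NUMERIC = re.compile(r"([+-])(\d{3,4})\b")


def zone_offset_minutes(zone_field):
    """Return the offset from GMT in minutes, or None if unknown.

    zone_field is whatever follows the time in the Date: header,
    e.g. "+1000 (EST)", "JST +900", "CEST" or "".
    """
    m = NUMERIC.search(zone_field)
    if m:
        # A numeric offset wins over any textual zone that is present.
        sign, digits = m.groups()
        digits = digits.zfill(4)          # "800" -> "0800"
        minutes = int(digits[:2]) * 60 + int(digits[2:])
        return -minutes if sign == "-" else minutes
    # No numeric offset: fall back to the name, if we recognize it.
    name = zone_field.strip().strip("()").upper()
    return ZONE_MINUTES.get(name)
```

With this rule "+1000 (EST)" resolves to ten hours east of GMT
regardless of the "EST", and an unrecognized name such as "CEST" comes
back as unknown (None) rather than being silently reported as GMT, so
the caller can decide how to display it.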

Here's one that probably breaks the RFCs but was previously interpreted
correctly. "28 Jul 2000 11:4:6 GMT" was previously "11:04GMT" but is
now "00:00GMT".

Okay, here's another timezone string the Australians apparently
co-opted. "Wed, 6 Sep 2000 08:52:50 +1100 (EST)" was previously
"08:52+11" but is now "08:52EDT". Indeed it appears that we need to
give the numeric offset precedence over the textual one, if both are
present.

I think that's all the differences in my folders. There were a lot more
instances, but I think they were all duplicates of these cases.

As far as finding the right offsets to use for timezones that nmh
doesn't grok (like NZS), one possible reference is:

    http://www.bsdi.com/date

-----------------------------------------------------------------------
Dan Harkless                   | To prevent SPAM contamination, please
[EMAIL PROTECTED]              | do not post this private email address
SpeedGate Communications, Inc. | to the USENET or WWW.  Thank you.
