Many thanks, Gilles. This patch works like a champ. The bad date format is
ignored and digging happily continues. I will inform the JRun folks (who
otherwise have a nice product).

Thanks again,
Denis


At 11:06 AM 2/3/99 -0600, Gilles Detillieux wrote:
>
>According to Geoff Hutchison:
>> > Header line: HTTP/1.1 200 OK
>> > Header line: Server: Microsoft-IIS/4.0
>> > Header line: Date: Wed, 03 Feb 1999 01:12:44 GMT
>> > Header line: Content-Type: text/html
>> > Header line: Cache-Control: no-cache="set-cookie,set-cookie2"
>> > Header line: Last-Modified: 27 Jan 1999 01:12:44 GMT
>> 
>> This last line is invalid. Last-modified headers should have a format like:
>> Header line: Last-Modified: Mon, 2 Feb 1999 01:12:44 GMT
>> 
>> See http://www.pmg.lcs.mit.edu/cgi-bin/rfc/view?2068
>> 
>> So you say "I don't care if it's invalid, ht://Dig should be able to keep
>> going." Fair enough. But I'm beginning to worry about the complexity of
>> that section of code if people keep finding non-compliant servers. There's
>> a reason for RFCs...
>> 
>> What should we do, decide that we'll give the current time to documents
>> from servers that return poorly-formatted dates? That doesn't sound like a
>> good solution to me.
>
>Well, we already ignore bad weekdays, so why not allow missing weekdays
>too.  Here's a patch to htdig-3.1.0dev-013199 to make getdate a bit
>more fault-tolerant.
>
>I'd like people to try it out to make sure it works, especially on
>systems that have had problems with mystrptime/strftime in the past.
>Note that this patch won't work for 3.1.0b4, because of other changes to
>getdate() since that release.  I'll post a patch for 3.1.0b4 separately.
>Please grab the one that is applicable to your source, or grab the latest
>snapshot and add this patch, and please let me know if this fixes the
>problems you've had, or breaks anything.  I've walked through the code
>quite carefully, and tested it on my server, and I'm quite confident
>it works, but independent confirmation would be a plus, especially as
>we're very close to final release.
>
>
>--- htdig/Document.cc.datebug  Tue Jan 26 18:27:21 1999
>+++ htdig/Document.cc  Wed Feb  3 10:39:20 1999
>@@ -191,9 +191,9 @@
> time_t
> Document::getdate(char *datestring)
> {
>-    String    d = datestring;
>     struct tm   tm;
>     time_t      ret;    
>+    char        *s;    
> 
>     //
>     // Two possible time designations:
>@@ -203,23 +203,29 @@
>     //
>     // We strip off the weekday before sending to strptime
>     // because some servers send invalid weekdays!
>+    // (Some don't even send a weekday, but we'll be flexible...)
>  
>-    int weekday_index = d.indexOf(',');
>-    if (weekday_index > 3)
>-        mystrptime(d.sub(weekday_index + 2), "%d-%b-%y %T", &tm);
>+    s = strchr(datestring, ',');
>+    if (s)
>+        s++;
>     else
>-      mystrptime(d.sub(weekday_index + 2), "%d %b %Y %T", &tm);
>-
>-    if (&tm != NULL) // We hope it isn't NULL!
>+        s = datestring;
>+    while (isspace(*s))
>+        s++;
>+    if (strchr(s, '-') && mystrptime(s, "%d-%b-%y %T", &tm) ||
>+            mystrptime(s, "%d %b %Y %T", &tm))
>       {
>+      // correct for mystrptime, if %Y format saw only a 2 digit year
>       if (tm.tm_year < 0)
>         tm.tm_year += 1900;
>       
>       if (debug > 2)
>         {
>-          cout << "Translated " << d << " to ";
>+          cout << "Translated " << datestring << " to ";
>           char        buffer[100];
>-          strftime(buffer, sizeof(buffer), "%a, %d %b %Y %T", &tm);
>+          // Leave out %a for weekday, because we don't set it anymore...
>+          //strftime(buffer, sizeof(buffer), "%a, %d %b %Y %T", &tm);
>+          strftime(buffer, sizeof(buffer), "%d %b %Y %T", &tm);
>           cout << buffer << " (" << tm.tm_year << ")" << endl;
>         }
> #if HAVE_TIMEGM
>@@ -230,6 +236,11 @@
>       }
>     else
>       {
>+      if (debug > 2)
>+        {
>+          cout << "Cannot translate " << datestring <<
>+                    ", using current time" << endl;
>+        }
>       ret = time(0); // This isn't the best, but it works. *fix*
>       }
>     if (debug > 2)
>
>-- 
>Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
>Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
>Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
>Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
>------------------------------------
>To unsubscribe from the htdig mailing list, send a message to
>[EMAIL PROTECTED] containing the single word "unsubscribe" in
>the SUBJECT of the message.
> 
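
For anyone who wants to try the fallback logic outside ht://Dig, the approach
in the patch above can be sketched as a standalone function. This is only a
sketch under assumptions: parse_http_date is a hypothetical name that does not
exist in ht://Dig, and POSIX strptime() stands in for ht://Dig's own
mystrptime(), so results may differ on systems where strptime is missing or
treats %y and %Y differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <time.h>   // strptime() is POSIX, not standard C++

// Hypothetical helper (not part of ht://Dig): parse an HTTP date whose
// weekday may be wrong or missing. Returns true and fills *out on success.
bool parse_http_date(const char *datestring, struct tm *out)
{
    assert(datestring != nullptr && out != nullptr);

    // Skip the weekday: everything up to and including the first comma.
    // If there is no comma, assume the weekday was omitted entirely.
    const char *s = std::strchr(datestring, ',');
    s = s ? s + 1 : datestring;
    while (std::isspace(static_cast<unsigned char>(*s)))
        s++;

    // Dashed form with a 2-digit year, e.g. "03-Feb-99 01:12:44".
    std::memset(out, 0, sizeof(*out));
    if (std::strchr(s, '-') && strptime(s, "%d-%b-%y %T", out))
        return true;

    // Spaced form with a 4-digit year, e.g. "03 Feb 1999 01:12:44".
    // Reset first, since a failed attempt may have filled some fields.
    std::memset(out, 0, sizeof(*out));
    return strptime(s, "%d %b %Y %T", out) != nullptr;
}
```

With this sketch, the header value from the failing server,
"27 Jan 1999 01:12:44 GMT", should parse the same way as a well-formed
"Wed, 27 Jan 1999 01:12:44 GMT", while a string that matches neither form
returns false so the caller can decide what to fall back to.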
