Your message dated Fri, 3 Feb 2006 16:42:44 +0100
with message-id <[EMAIL PROTECTED]>
and subject line Bug#148799: wget: wget does not treat *.shtml, *.phtml, etc. 
as html while using -r
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
>From [EMAIL PROTECTED] Sun Jun 02 03:40:12 2002
Return-path: <[EMAIL PROTECTED]>
Received: from gnome05.net.rol.ru [194.67.1.186] 
        by master.debian.org with esmtp (Exim 3.12 1 (Debian))
        id 17EQuK-00018N-00; Sun, 02 Jun 2002 03:40:12 -0500
Received: from ts1-a233.Irkutsk.dial.rol.ru ([195.239.202.233]:62992 "EHLO
        bearloga" ident: "NO-IDENT-SERVICE[2]" whoson: "-unregistered-"
        smtp-auth: <none> TLS-CIPHER: <none> TLS-PEER: <none>)
        by gnome05.sovam.com with ESMTP id <S2444740AbSFBIj5>;
        Sun, 2 Jun 2002 12:39:57 +0400
Received: from fedor by bearloga with local (Exim 3.35 #1 (Debian))
        id 17EQJW-0005DG-00; Sun, 02 Jun 2002 17:02:10 +0900
From:   Zuev Fedor <[EMAIL PROTECTED]>
Subject: wget: wget does not treat *.shtml, *.phtml, etc. as html while using -r
To:     [EMAIL PROTECTED]
X-Mailer: bug 3.3.10.1
Message-Id: <[EMAIL PROTECTED]>
Date:   Sun, 02 Jun 2002 17:02:10 +0900
Delivered-To: [EMAIL PROTECTED]

Package: wget
Version: 1.8.1-6
Severity: wishlist

When I recursively download an HTML tree with the -r -nc options in several
sessions, wget does not treat already downloaded files with suffixes such as
.phtml or .shtml (anything except .htm[l]) as text/html.

Files with these suffixes are widely used nowadays, and the absence of proper
handling seriously decreases the usefulness of wget.

I made a quick fix by simply adding checks for the most widely used
suffixes (see below). But it would probably be wise to add a special
configuration option to /etc/wgetrc or something similar.

diff -bBurN wget-1.8.1/src/http.c wget-1.8.1.new/src/http.c
--- wget-1.8.1/src/http.c       Fri Dec 14 00:46:56 2001
+++ wget-1.8.1.new/src/http.c   Sun Jun  2 16:50:40 2002
@@ -1448,7 +1448,12 @@
       /* #### Bogusness alert.  */
       /* If its suffix is "html" or "htm", assume text/html.  */
       if (((suf = suffix (*hstat.local_file)) != NULL)
-         && (!strcmp (suf, "html") || !strcmp (suf, "htm")))
+         && (!strcmp (suf, "html") ||
+             !strcmp (suf, "shtml") ||
+             !strcmp (suf, "htm") ||
+             !strcmp (suf, "phtml") ||
+             !strcmp (suf, "asp") ||
+             !strcmp (suf, "php")))
        *dt |= TEXTHTML;

       FREE_MAYBE (dummy);
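
The configurable variant suggested above could replace the chain of strcmp
calls with a lookup in a suffix table. The following is only a minimal,
standalone sketch of that idea, not wget code: html_suffixes, file_suffix and
has_html_suffix are hypothetical names, the default list simply mirrors the
patch, and a real option would have to fill the table from wgetrc. The
comparison is case-sensitive, as in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical default list mirroring the patch above; a real
   configuration option would populate this from wgetrc instead.  */
static const char *const html_suffixes[] = {
  "html", "htm", "shtml", "phtml", "asp", "php", NULL
};

/* Return the text after the last '.' in the final path component,
   or NULL if that component has no non-empty suffix.  */
const char *
file_suffix (const char *path)
{
  const char *slash = strrchr (path, '/');
  const char *base = slash ? slash + 1 : path;
  const char *dot = strrchr (base, '.');
  return (dot && dot[1] != '\0') ? dot + 1 : NULL;
}

/* Return 1 if PATH ends in one of the NULL-terminated SUFFIXES,
   0 otherwise.  */
int
has_html_suffix (const char *path, const char *const *suffixes)
{
  const char *suf = file_suffix (path);
  if (suf == NULL)
    return 0;
  for (; *suffixes != NULL; suffixes++)
    if (strcmp (suf, *suffixes) == 0)
      return 1;
  return 0;
}
```

With such a helper, the check in the patched hunk would reduce to a single
call along the lines of has_html_suffix (*hstat.local_file, html_suffixes),
and new suffixes could be added without touching the condition itself.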

-- System Information
Debian Release: 3.0
Kernel Version: Linux bearloga 2.4.17 #9 Sat Dec 29 20:49:41 IRKT 2001 i586 unknown

Versions of the packages wget depends on:
ii  libc6          2.2.5-3        GNU C Library: Shared libraries and Timezone

--- Begin /etc/wgetrc (modified conffile)
waitretry = 100

--- End /etc/wgetrc


--- End Message ---
--- Begin Message ---
>From [EMAIL PROTECTED] Fri Feb 03 07:43:17 2006
Return-path: <[EMAIL PROTECTED]>
Received: from mail.gmx.de ([213.165.64.21] helo=mail.gmx.net)
        by spohr.debian.org with smtp (Exim 4.50)
        id 1F535h-0006if-4U
        for [EMAIL PROTECTED]; Fri, 03 Feb 2006 07:43:17 -0800
Received: (qmail invoked by alias); 03 Feb 2006 15:42:46 -0000
Received: from dslb-084-063-030-013.pools.arcor-ip.net (EHLO colt.pezone.net) 
[84.63.30.13]
  by mail.gmx.net (mp030) with SMTP; 03 Feb 2006 16:42:46 +0100
X-Authenticated: #495269
From: Peter Eisentraut <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Re: Bug#148799: wget: wget does not treat *.shtml, *.phtml, etc. as 
html while using -r
Date: Fri, 3 Feb 2006 16:42:44 +0100
User-Agent: KMail/1.8.3
MIME-Version: 1.0
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <[EMAIL PROTECTED]>
X-Y-GMX-Trusted: 0
X-Spam-Checker-Version: SpamAssassin 2.60-bugs.debian.org_2005_01_02 
        (1.212-2003-09-23-exp) on spohr.debian.org
X-Spam-Level: 
X-Spam-Status: No, hits=-4.5 required=4.0 tests=BAYES_10,HAS_BUG_NUMBER 
        autolearn=no version=2.60-bugs.debian.org_2005_01_02

Version: 1.10.2-1

This is fixed now.

--- End Message ---
