File Rejection (-R) does not appear to be working, etc.
The following does not do what I expected it to do:
/usr/sfw/bin/wget -R .db,LCK,[Ll][Oo][Gg],[Cc]opy?[Oo]f -P \
  /d0/denStageSvrUC -nH -m -X /apps,/.DAV/_notes \
  ftp://gwheeler:password@myhost.abc.org/
It does not reject files such as log
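
One possible explanation, offered as a sketch rather than a diagnosis: as I read the wget manual, an -R element that contains a wildcard character (*, ?, [ or ]) is treated as a pattern rather than a suffix, and a pattern has to match the whole file name. If that holds for this wget build, [Ll][Oo][Gg] would reject only a file named exactly "log", not "error.log". The snippet below uses the shell's own `case` globbing as a stand-in for that matching; the `matches` helper and the sample file names are hypothetical.

```shell
# Hypothetical helper: succeed if NAME matches the glob PATTERN as a whole.
matches() {
    # $1 = pattern, $2 = file name
    case "$2" in
        $1) return 0 ;;   # unquoted, so $1 is interpreted as a glob pattern
        *)  return 1 ;;
    esac
}

# Sample names are made up for illustration.
for name in log LOG error.log "Copy of notes.txt"; do
    for pat in '[Ll][Oo][Gg]' '*.[Ll][Oo][Gg]' '[Cc]opy?[Oo]f' '[Cc]opy?[Oo]f*'; do
        if matches "$pat" "$name"; then
            echo "$name matches $pat"
        fi
    done
done
```

If whole-name matching is indeed what happens, patterns such as '*.[Ll][Oo][Gg]' and '[Cc]opy?[Oo]f*' may be closer to the intent. Quoting the -R argument also keeps the shell from expanding the brackets before wget sees them.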
Hi,
I have a webpage
that has some HTML text that has been pasted from MS Word and the quote
char ' is a special "type", i.e. not the ASCII one. This char displays fine in
IE/Firefox. However, when I spider the page with Wget (windows) it encodes this
character in a funny way e.g. areaĆ¢(tm)s =
Wget shouldn't alter the page contents, except for converted links.
Is the funny character in places which Wget should know about
(e.g. URLs in links) or in the page text? Could you post a minimal
excerpt from the page, before and after garbling done by Wget?
Alternately, could you post a URL
Hi,
Thanks for the reply. It is the page text that is the problem.
When I started to investigate it further I found that it actually only
happens when the page being wgot is a .aspx (.net asp) file.
I made three identical files (as below): one with an .html extension, one
with .aspx, and one with .zzz
I'm not sure what causes this problem, but I suspect it does not come
from Wget doing something wrong. That Notepad opens the file
correctly is indicative enough.
Maybe those browsers don't understand UTF-8 (or other) encoding of
Unicode when the file is opened on-disk?
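
To illustrate the kind of garbling this hypothesis would produce (a sketch only, assuming the page is UTF-8 and the viewer falls back to Windows-1252, neither of which the thread confirms): the Word-style apostrophe U+2019 is three bytes in UTF-8, and reading those bytes as Windows-1252 yields three unrelated characters. The snippet assumes an `iconv` that knows the WINDOWS-1252 charset name.

```shell
# UTF-8 bytes of U+2019 RIGHT SINGLE QUOTATION MARK, in octal (hex E2 80 99).
quote=$(printf '\342\200\231')

# Misread those bytes as Windows-1252, then re-encode the result as UTF-8
# so the terminal shows what a misconfigured viewer would display.
garbled=$(printf '%s' "$quote" | iconv -f WINDOWS-1252 -t UTF-8)

# Print the intended word next to the misdecoded one.
printf 'area%ss -> area%ss\n' "$quote" "$garbled"
```

In this scenario the bytes on disk are unchanged; only the interpretation differs, which would be consistent with Notepad showing the file correctly.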
On Wed, 30 Mar 2005, Hrvoje Niksic wrote:
Behdad Esfahbod writes:
Well, sorry if it's all nonsense now: Last year I sent the
following mail, and got a reply confirming this bug and that it
may be changed to use pid instead of a serial in log filename.
Recently I was
Sorry for the noise, I just noticed: When testing with:
for x in `seq 1 100`; do wget -b http://behdad.org/; done
If I use the 1.8.2 version, I get 100 different log
files but only 14 index.html files. With the CVS version, I see in
the log that it's trying to find a nonexistent file: