Hi -

I'm invoking wget 1.9.1 with the following options, among others:

--input-file _filename_
--restrict-file-names='windows'
--directory-prefix=/www/htdocs/newfolder1/newfolder2
--convert-links
--html-extension

but not recursion. 

The reason I'm using an input file instead of recursion is this passage in the 
documentation for --accept _acclist_:

"Note that these two options do not affect the downloading of HTML files; Wget 
must load all the HTMLs to know where to go at all--recursive retrieval would 
make no sense otherwise."

Well, not quite.  I want to retrieve all pages named
http://my.source.site/oldfolder/abc_pages.asp?id=nnnnn and
http://my.source.site/oldfolder/def_pages.asp?id=nnnnn and
http://my.source.site/oldfolder/ghi_pages.asp?id=nnnnn
but not pages named
http://my.source.site/oldfolder/jkl_pages.asp?id=nnnnn or
http://my.source.site/oldfolder/mno_pages.asp?id=nnnnn or
http://my.source.site/oldfolder/pqr_pages.asp?id=nnnnn . 

(Where nnnnn is the 5-digit number corresponding to the actual content.)

That is, they are all in the same directory, with different whatever.asp names. 
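
To make the selection rule concrete, here is a rough Python sketch of the 
filter I would like applied to each URL. The function name is_wanted and the 
two patterns are made up for illustration; they are not anything wget provides.

```python
import re
from urllib.parse import urlsplit

# Pages I want: abc_, def_ and ghi_ pages. Anything else under /oldfolder/
# (jkl_, mno_, pqr_, ...) is unwanted.
WANTED_PATH = re.compile(r"^/oldfolder/(abc|def|ghi)_pages\.asp$")
# nnnnn is a 5-digit number identifying the actual content.
ID_QUERY = re.compile(r"^id=\d{5}$")

def is_wanted(url):
    """Return True if url is one of the pages I want to retrieve."""
    parts = urlsplit(url)
    return (
        parts.netloc == "my.source.site"
        and WANTED_PATH.match(parts.path) is not None
        and ID_QUERY.match(parts.query) is not None
    )
```

For example, is_wanted("http://my.source.site/oldfolder/abc_pages.asp?id=12345") 
is true, while the same URL with jkl_pages.asp is rejected.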

What's happening is that the pages in my input list are correctly getting 
copied to 

http://my.target.site/newfolder1/newfolder2/[EMAIL PROTECTED] etc

but the links in the pages are untranslated from their original

/oldfolder/def_pages.asp?id=nnnnn

instead of being translated to a working

[EMAIL PROTECTED]
or   
/newfolder1/newfolder2/[EMAIL PROTECTED]

and links to unwanted pages such as
/oldfolder/jkl_pages.asp?id=nnnnn 
are not being translated to
http://my.source.site/oldfolder/jkl_pages.asp?id=nnnnn
in the files on my new site. 

I'm guessing that --convert-links will only work with recursion, and that's why 
the links also aren't being fixed for the .html extension or the Windows file 
names.  Is there a way to get some of the HTML files and not others when they 
are in the same directory, but still get the links fully translated? Or will I 
need to post-edit my new files outside of wget to fix the links?
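
If post-editing is the answer, the kind of thing I have in mind is sketched 
below in Python. It is only a sketch under assumptions: fix_links and 
local_name are names I made up, the pattern only handles plain href attributes 
pointing at /oldfolder/, and local_name guesses that the saved file is named 
with "@" in place of "?" (from --restrict-file-names=windows) plus a trailing 
".html" (from --html-extension). I would check that guess against the files 
wget actually wrote before relying on it.

```python
import re

SOURCE_SITE = "http://my.source.site"
WANTED = ("abc", "def", "ghi")  # page prefixes that were downloaded

# Matches href="/oldfolder/<prefix>_pages.asp?id=<5 digits>" with either quote style.
LINK = re.compile(
    r'''(href\s*=\s*)(["'])/oldfolder/([a-z]+)_pages\.asp\?id=(\d{5})\2''',
    re.IGNORECASE,
)

def local_name(prefix, page_id):
    # ASSUMPTION: local file naming as guessed above; verify against real output.
    return f"{prefix}_pages.asp@id={page_id}.html"

def fix_links(html):
    """Point wanted links at the local copies and unwanted ones at the source site."""
    def repl(match):
        attr, quote, prefix, page_id = match.groups()
        if prefix.lower() in WANTED:
            # Downloaded page: link relative to the same target directory.
            target = local_name(prefix, page_id)
        else:
            # Page not downloaded: make the link absolute on the old site.
            target = f"{SOURCE_SITE}/oldfolder/{prefix}_pages.asp?id={page_id}"
        return f"{attr}{quote}{target}{quote}"
    return LINK.sub(repl, html)
```

I would run fix_links over the text of each downloaded file and write the 
result back, but I'd much rather have wget do the translation itself if it can.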

Note: The target site is on a Un*x box, but I have to be able to 
upload/download from a PC.

Thanks in advance,
Charles "Chas" Belov
