Can wget be used to produce a recursive file listing of every file on an FTP server?

2008-03-18 Thread Chaim Krause
I went to #wget and got the following... -ChanServ- [#wget] No one around to answer your question? Try the mailing list at [EMAIL PROTECTED] No subscription necessary, just ask to be Cc'd. So I am posting this and asking to be Cc'd. Short version: Can wget be used to produce a recursive
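
One approach worth sketching, on the hedged assumption that the server allows anonymous recursive traversal: wget can keep the raw FTP directory listings it fetches instead of deleting them, and those can be collected afterwards. Whether --spider fully suppresses the FTP downloads varies by version; without it, the whole tree is downloaded as a side effect. The host below is a placeholder.

    # keep each directory's .listing file instead of removing it
    wget -r --no-remove-listing --spider ftp://ftp.example.com/
    # the listings together form the recursive index
    find ftp.example.com -name '.listing' -exec cat {} +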

Error with --recursive = zero exit status?

2007-11-20 Thread Patrick
GNU Wget 1.10.2, and it's more an inconsistency than a bug. Basically, if something goes wrong with the network connection while downloading recursively, wget will time out as expected, but without reporting any kind of an error. No non-zero exit status, at least. So when I put this in a script:
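
The scripting pattern at stake looks roughly like this (URL hypothetical); the poster's complaint is that a recursive run can hit timeouts mid-crawl and still exit 0, so the guard never fires.

    wget -r -T 30 -t 3 http://www.example.com/data/
    status=$?
    if [ $status -ne 0 ]; then
        echo "wget exited with status $status" >&2
        exit "$status"
    fi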

Re: Error with --recursive = zero exit status?

2007-11-20 Thread Micah Cowan
Micah Cowan wrote: Patrick wrote: GNU Wget 1.10.2, and it's more an inconsistency than a bug. I definitely consider it a bug. Basically, if something goes wrong with the network connection while downloading recursively, wget will time out as

Re: Error with --recursive = zero exit status?

2007-11-20 Thread Micah Cowan
Patrick wrote: GNU Wget 1.10.2, and it's more an inconsistency than a bug. I definitely consider it a bug. Basically, if something goes wrong with the network connection while downloading recursively, wget will time out as expected, but without

Recursive downloading and post

2007-10-22 Thread Stuart Moore
to be sending post requests designed for one page to another one) Is there any way to get wget to only use the post data for the first file downloaded? I couldn't find any in the documentation - in fact there seems to be nothing in the documentation regarding the interaction of recursive downloading

Re: Recursive downloading and post

2007-10-22 Thread Micah Cowan
to be nothing in the documentation regarding the interaction of recursive downloading with post data. It would be great to see the current behaviour documented somewhere. Alternatively, if anyone can suggest any workarounds, that'd be much appreciated. I need to convert the links, so just

RE: Recursive downloading and post

2007-10-22 Thread Tony Lewis
Micah Cowan wrote: Stuart Moore wrote: Is there any way to get wget to only use the post data for the first file downloaded? Unfortunately, I'm not sure I can offer much help. AFAICT, --post-file and --post-data weren't really designed for use with recursive downloading. Perhaps

[fwd] Wget Bug: recursive get from ftp with a port in the url fails

2007-09-17 Thread Hrvoje Niksic
Hi, I am using wget 1.10.2 on Windows 2003, and have the same problem as Cantara. The file system is NTFS. Well, I find my problem is, I wrote the command in Scheduled Tasks like this: wget -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P d:\virus.update\kaspersky well, after

Re: [fwd] Wget Bug: recursive get from ftp with a port in the url fails

2007-09-17 Thread Micah Cowan
Hrvoje Niksic wrote: Subject: Re: Wget Bug: recursive get from ftp with a port in the url fails From: baalchina [EMAIL PROTECTED] Date: Mon, 17 Sep 2007 19:56:20 +0800 To: [EMAIL PROTECTED] Message-ID: [EMAIL PROTECTED] MIME-Version: 1.0 Content-Type

Re: --spider requires --recursive

2007-08-18 Thread Micah Cowan
Matthias Vill wrote: Should --spider imply --recursive? I guess many people expect it to behave that way (and therefore I think it is a good idea that the output complains on not using --recursive, but still some may want to have a single-file

Re: --spider requires --recursive

2007-08-18 Thread Josh Williams
On 8/18/07, Micah Cowan [EMAIL PROTECTED] wrote: I'm not convinced. To me, the name spider implies recursion, and it's counter-intuitive for it not to. As to wasted functionality, what's wrong with -O /dev/null (or NUL or whatever) for simply checking existence? I see his point. The

--spider requires --recursive

2007-08-17 Thread Josh Williams
Is there any particular reason the --spider option requires --recursive? As it is now, we run into the following error if we omit --recursive: [EMAIL PROTECTED]:~/cprojects/wget/src$ ./wget http://www.google.com --spider Spider mode enabled. Check if remote file exists. --00:37:21-- http
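
For contrast, a sketch of the two modes discussed in this thread; in released wget versions --spider also works on a single URL, and -O /dev/null is the existence-check alternative suggested above.

    # recursive link checking without saving pages
    wget -r --spider http://www.google.com/
    # plain existence check, discarding the body
    wget -O /dev/null http://www.google.com/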

Re: --spider requires --recursive

2007-08-17 Thread Matthias Vill
it is interested in the links within Should --spider imply --recursive? I guess many people expect it to behave that way (and therefore I think it is a good idea that the output complains on not using --recursive, but still some may want to have a single-file-checking-option. So we would waste

Recursive function does not work with -O

2007-06-08 Thread Gekko
wget -r http://murga-linux.com/puppy/index.php?f=11 -O - This returns only the first page it downloads, and does not continue to download the other links, while omitting the -O - allows the downloading to work. # wget --version GNU Wget 1.10.2 # uname -a Linux [domain removed] 2.6.18.1 #1 Thu

Re: Recursive function does not work with -O

2007-06-08 Thread Steven M. Schweda
From: Gekko [...] returns only the first page it downloads, and does not continue to download the other links, while omitting the -O - allows the downloading to work. That's right. In recursive HTTP operation, wget expects to read its own output files to find the links to follow. It's
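
A sketch of the contrast being explained: recursive HTTP retrieval re-reads the files it has just written to discover further links, and -O - leaves it nothing on disk to re-read.

    # recurses: each page lands in its own file, which wget then parses for links
    wget -r 'http://murga-linux.com/puppy/index.php?f=11'
    # fetches only the first page: all output is funneled to stdout
    wget -r 'http://murga-linux.com/puppy/index.php?f=11' -O -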

How to continue a recursive downloading process without checking for updates?

2007-05-28 Thread hoho hoho
Hi everybody: This is my first time doing this. I want to download a site to my computer, and the situation is as described below: 1. My Internet connection is time-limited; I can't get online at night. 2. The power supply will also be cut off at night, so I can't keep my computer

Bug using recursive get and stdout

2007-04-17 Thread Jonathan A. Zdziarski
Greetings, Stumbled across a bug yesterday reproduced in both v1.8.2 and 1.10.2. Apparently, recursive get tries to open the file for reading after downloading, to download subsequent files. Problem is, when used with -O - to deliver to stdout, it cannot open that file, so you get

Re: Bug using recursive get and stdout

2007-04-17 Thread Steven M. Schweda
A quick search at http://www.mail-archive.com/wget@sunsite.dk/ for -O found: http://www.mail-archive.com/wget@sunsite.dk/msg08746.html http://www.mail-archive.com/wget@sunsite.dk/msg08748.html The way -O is implemented, there are all kinds of things which are incompatible with

wget recursive too broad?

2007-02-15 Thread Jesse Peterson
(Please cc me/reply all) No matter what I try (including specifically limiting domains with -H and -D) wget crawls sites that are not specified on the command line. For example, a simple: % wget -r -l 1 http://www.nytimes.com [stop after 2 mins, and then] % ls -1 homedelivery.nytimes.com/
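
One likely culprit, hedged since the thread does not show the full command: -D matches domain suffixes, so -D nytimes.com still admits homedelivery.nytimes.com. Two variants that tighten the filter:

    # restrict spanning to the exact host
    wget -r -l 1 -H -D www.nytimes.com http://www.nytimes.com/
    # or shut out the offending subdomain explicitly
    wget -r -l 1 --exclude-domains homedelivery.nytimes.com http://www.nytimes.com/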

Feature Request: Prefiltering (applicable to recursive gets)

2006-12-03 Thread Thejaka Maldeniya
It seems pointless to download ALL html content under some circumstances... esp. if all or most pages contain php links, like Apache file listings (e.g. ?N=D, ?M=A, etc.). Why not add an option like --rreject, --pre-reject, or maybe --pre-filter, so we can specify which types of links to
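
Such a pre-filter did eventually appear, though long after this thread: --accept-regex/--reject-regex (wget 1.14 and later) are applied to URLs before download. A sketch against an Apache-style listing (URL hypothetical):

    # skip the ?N=D / ?M=A sort links without fetching them
    wget -r --reject-regex '\?[NMSD]=[AD]' http://www.example.com/files/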

Recursive wget

2006-10-29 Thread Shaun Jackman
The following web site contains links to a number of music (MP3) files, but wget -r does not download the linked files. I've read the manual page and tried the options -r and -m with -v to no avail. This bug is more likely a bug in the documentation than in the program. wget does not give any

Re: Recursive wget

2006-10-29 Thread Fi Dot
On 10/29/06, Shaun Jackman [EMAIL PROTECTED] wrote: The following web site contains links to a number of music (MP3) files, but wget -r does not download the linked files. I've read the manual page and tried the options -r and -m with -v to no avail. This bug is more likely a bug in the

Recursive download should allow pruning based on filename pattern

2006-10-06 Thread Axel Boldt
It is my understanding that if an HTML file matches a reject pattern during recursive download, the file is still downloaded, parsed for its links and then deleted. The manual hints at this behavior by saying Note that these two options [-R and -A] do not affect the downloading of html files
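
A sketch of the documented behaviour in question (site hypothetical): an HTML page matching -R is still fetched so its links can be harvested, then deleted; only non-HTML matches are skipped outright.

    # every .html file is downloaded, parsed for links, then removed;
    # a rejected .zip would never be requested at all
    wget -r -R '*.html' http://www.example.com/site/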

recursive download

2006-05-19 Thread Ajar Taknev
Hi, I am trying to recursively download from an ftp site without success. I am behind a squid proxy and I have set up the .wgetrc correctly. When I do wget -r ftp:/ftp.somesite.com/dir it fetches ftp.somesite.com/dir/index.html and exits. It doesn't do a recursive download. When I do the same
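
Worth noting before anything else: the command as quoted has a single slash after ftp:, which is itself a malformed URL if it is not just a transcription slip. The standard form would be:

    wget -r ftp://ftp.somesite.com/dir/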

Re: recursive download

2006-05-19 Thread Mauro Tortonesi
Ajar Taknev wrote: Hi, I am trying to recursively download from an ftp site without success. I am behind a squid proxy and I have set up the .wgetrc correctly. When I do wget -r ftp:/ftp.somesite.com/dir it fetches ftp.somesite.com/dir/index.html and exits. It doesn't do a recursive download

Re: fixed recursive ftp download over proxy and 1.10.3

2006-05-19 Thread Mauro Tortonesi
[EMAIL PROTECTED] wrote: Hi, I have been embarrassed with the ftp-over-http bug for quite a while: 1.5 years. I was very happy to learn that someone had developed a patch, and happier to read that you would merge it shortly. Do you know when you will be able to publish this 1.10.3 release

Re: recursive download

2006-05-19 Thread Steven M. Schweda
From: Mauro Tortonesi [...] this is one of the pending bugs that will be fixed before the upcoming 1.11 release. At the risk of beating a dead horse yet again, is there any chance of getting the VMS changes into this upcoming 1.11 release?

fixed recursive ftp download over proxy and 1.10.3

2006-05-17 Thread amailp
Hi, I have been embarrassed with the ftp-over-http bug for quite a while: 1.5 years. I was very happy to learn that someone had developed a patch, and happier to read that you would merge it shortly. Do you know when you will be able to publish this 1.10.3 release? Thanks for your work on

Re: Wget Bug: recursive get from ftp with a port in the url fails

2006-04-13 Thread Hrvoje Niksic
Jesse Cantara [EMAIL PROTECTED] writes: A quick resolution to the problem is to use the -nH command line argument, so that wget doesn't attempt to create that particular directory. It appears as if the problem is with the creation of a directory with a ':' in the name, which I cannot do

Wget Bug: recursive get from ftp with a port in the url fails

2006-04-12 Thread Jesse Cantara
I've encountered a bug when trying to do a recursive get from an ftp site with a non-standard port defined in the url, such as ftp.somesite.com:1234. An example of the command I am typing is: wget -r ftp://user:[EMAIL PROTECTED]:4321/Directory/*, where Directory contains multiple subdirectories, all
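
The follow-up above resolves this with -nH, which stops wget from creating the host:port directory whose ':' the filesystem rejects. A sketch with placeholder credentials (the original's are redacted):

    wget -r -nH 'ftp://user:PASSWORD@ftp.somesite.com:4321/Directory/*'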

Re: wget option (idea for recursive ftp/globbing)

2006-03-09 Thread Mauro Tortonesi
Tony Lewis wrote: Mauro Tortonesi wrote: I would like to read other users' opinions before deciding which course of action to take, though. Other users have suggested adding a command line option for -a two or three times in the past: - 2002-11-24: Steve Friedl [EMAIL PROTECTED]

Fwd: Recursive FTP Fail -- unsupported file type?

2006-03-08 Thread Jake Peavy
-- Forwarded message -- From: Jake Peavy [EMAIL PROTECTED] Date: Mar 8, 2006 2:57 PM Subject: Recursive FTP Fail -- unsupported file type? To: [EMAIL PROTECTED] Hi all, I'm attempting to download some .bin files using wget, observe: [EMAIL PROTECTED] ~]$ wget ftp://user:[EMAIL

Re: Fwd: Recursive FTP Fail -- unsupported file type?

2006-03-08 Thread Steven M. Schweda
It might help to know which version of Wget you're using, and on what you're using it. What's the purpose of the * in your command? Steven M. Schweda (+1) 651-699-9818 382 South Warwick Street

Re: wget option (idea for recursive ftp/globbing)

2006-03-02 Thread Mauro Tortonesi
MagicalTux wrote: Hello, I'm working a lot on website mirroring from various hosting companies, and I noticed that usually, hidden files aren't shown by default when LISTing a directory. This results in some files (.htaccess/.htpasswd) not being mirrored. I did an ugly hack for my

RE: wget option (idea for recursive ftp/globbing)

2006-03-02 Thread Tony Lewis
Mauro Tortonesi wrote: I would like to read other users' opinions before deciding which course of action to take, though. Other users have suggested adding a command line option for -a two or three times in the past: - 2002-11-24: Steve Friedl [EMAIL PROTECTED] submitted a patch - 2002-12-24:

referer not sent on all recursive requests?

2006-02-14 Thread jonah benton
Hi- I'm using wget in some qa scripts to recurse through a site I'm developing to find 404s and 500s and bad resource references. I'm using RHEL4's wget: GNU Wget 1.10.2 (Red Hat modified) I'm running it as per below: wget \ -kpSrN \ -F -i $urls \ -B $base \ -D $domain \

Re: wget 1.10.x fixed recursive ftp download over proxy

2006-01-11 Thread CHEN Peng
= retrieve_url (*t, filename, redirected_URL, NULL, dt); Tony -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of CHEN Peng Sent: Tuesday, January 10, 2006 5:06 PM To: Tony Lewis Cc: [EMAIL PROTECTED] Subject: Re: wget 1.10.x fixed recursive ftp download over

Re: wget 1.10.x fixed recursive ftp download over proxy

2006-01-11 Thread Mauro Tortonesi
is a list of things that need to be done before releasing 1.10.3, just in case you're interested: - test/fix HTTP code - merge recursive FTP over HTTP proxy patch - merge range patch - fix Content-Disposition support - finish testing suite and I am sure I am forgetting something... -- Aequam memento

Recursive directory exclusion problem

2006-01-10 Thread Werner van der Walt
Hi all, I am trying to mirror a site on my Linux box but want to exclude particular directories contained at random places in the download path. Example: I want to exclude directory XYZ and its contents whenever wget finds it in the current path. site /dir1 /dir2 /XYZ
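
The usual answer is --exclude-directories (-X), whose list elements may contain wildcards; whether a leading wildcard spans several path levels can depend on the wget version, so explicit paths are the conservative fallback. Site URL hypothetical:

    # prune XYZ wherever it occurs
    wget -m -X '*/XYZ' http://www.example.com/
    # conservative form with explicit paths
    wget -m -X /dir1/XYZ,/dir2/XYZ http://www.example.com/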

Re: wget 1.10.x fixed recursive ftp download over proxy

2006-01-10 Thread CHEN Peng
Your simplified code may not work. The intention of the patch is to make wget invoke the retrieve_tree function when it IS FTP and uses a proxy, while your code works when it is NOT FTP and uses a proxy. On 1/10/06, Tony Lewis [EMAIL PROTECTED] wrote: I believe the following simplified code would have

RE: wget 1.10.x fixed recursive ftp download over proxy

2006-01-09 Thread Tony Lewis
] [mailto:[EMAIL PROTECTED] On Behalf Of CHEN Peng Sent: Monday, January 09, 2006 12:38 AM To: [EMAIL PROTECTED] Subject: wget 1.10.x fixed recursive ftp download over proxy Hi, We once encountered an annoying problem recursively downloading FTP data using wget, through an ftp-over-http proxy

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-09 Thread Mauro Tortonesi
Hi Steven, thank you very much for your bug report. Recently I committed a significant change to the HTTP code in the SVN trunk, but I had no time to thoroughly test the code because I am busy working on my Ph.D. thesis. I am sure this bug is related to the changes I made. I'll take a deeper

wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven P. Ulrick
the same way. But I JUST discovered that recursive downloading from ftp domains seems to work perfectly. I am now downloading ftp.crosswire.org, and it looks like it would happily continue until there was no more to download.

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven M. Schweda
[...] wget is the SVN version, which is located at /usr/local/bin/wget [...] [...] (/usr/bin/wget is the version of wget that ships with the distro that I run, Fedora Core 3) [...] Results from wget -V would be much more informative than knowing the path(s) to the executable(s). (Should I

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven P. Ulrick
On Thu, 5 Jan 2006 08:55:53 -0600 (CST) [EMAIL PROTECTED] (Steven M. Schweda) wrote: [...] wget is the SVN version, which is located at /usr/local/bin/wget [...] [...] (/usr/bin/wget is the version of wget that ships with the distro that I run, Fedora Core 3) [...] Results from wget

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Hrvoje Niksic
[EMAIL PROTECTED] (Steven M. Schweda) writes: Results from wget -V would be much more informative than knowing the path(s) to the executable(s). (Should I know what SVN is?) I believe SVN stands for Subversion, the version control software that runs the repository.

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven M. Schweda
Adding -d to your wget commands could also be more helpful in finding a diagnosis. Still true. GNU Wget 1.10.2b built on VMS Alpha V7.3-2 (the original wget 1.10.2 with my VMS-related and other changes) seems to work just fine on that site. You might try starting with a less

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven P. Ulrick
) to 1 Setting --recursive (recursive) to 1 Setting --no (noclobber) to 1 DEBUG output created by Wget 1.10+devel on linux-gnu. Enqueuing http://www.afolkey2.net/ at depth 0 Queue count 1, maxcount 1. Dequeuing http://www.afolkey2.net/ at depth 0 Queue count 0, maxcount 1. in http_loop in http_loop

Re: wget from SVN: Issue with recursive downloading from http:// sites

2006-01-05 Thread Steven M. Schweda
Your -d output suggests a defective Wget (probably because Wget/1.10+devel was still in development). A working one spews much more stuff (as it downloads much more stuff). I'd try starting with the last released source kit: http://www.gnu.org/software/wget/

wget can't do recursive retrieval on ftp with proxy

2005-10-13 Thread jmal8295
When doing recursive retrieval on ftp, wget (net-misc/wget-1.9.1-r5) is using the ftp glob function. That's probably why the standard recursion code is not used on ftp. The problem is when a proxy is later detected and the http code is used instead ... without

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-19 Thread Mauro Tortonesi
At 09:06 on Saturday, 17 September 2005, Steven M. Schweda wrote: I suppose that it's a waste of time and space to point this out here, but native VMS status codes include a severity field (the low three bits), with popular values being (from STSDEF.H): #define STS$K_WARNING 0

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-19 Thread Steven M. Schweda
From: Mauro Tortonesi [EMAIL PROTECTED] Ideally, the values used could be defined in some central location, allowing convenient replacement with suitable VMS-specific values when the time comes. (Naturally, _all_ exit() calls and/or return statements should use one of the pre-defined

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-19 Thread Mauro Tortonesi
At 18:06 on Monday, 19 September 2005, Hrvoje Niksic wrote: Mauro Tortonesi [EMAIL PROTECTED] writes: Hmm, I don't understand why we should use VMS-specific values in wget. The closest Unix has to offer are these BSD-specific values which few programs use: /* * SYSEXITS.H -- Exit

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-19 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes: Yes, but I was thinking of defining wget-specific error codes. I wouldn't object to those. The scripting people might find them useful.

Re: recursive and non-clobber

2005-09-16 Thread Niel Drummond
the --recursive non-clobber suppression, and saves files appended with a number, though it would be nice to have some extra combination of commands to re-enable it. Is there a useful way around this 'default' behaviour, without replicating the directory tree locally? Not that I know

Re: recursive and non-clobber

2005-09-15 Thread Mauro Tortonesi
files are saved in the same folder. My recompiled wget ignores the --recursive non-clobber suppression, and saves files appended with a number, though it would be nice to have some extra combination of commands to re-enable it. Is there a useful way around this 'default' behaviour, without

with recursive wget status code does not reflect success/failure of operation

2005-09-14 Thread Owen Cliffe
I'm not sure if this is a bug or a feature, but with recursive operation, if a get fails and retrieve_tree bails out then no sensible error codes are returned to main.c (errors are only passed up if the user's quota was full, the URL was invalid or there was a write error) so retrieve_tree always

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-14 Thread Hrvoje Niksic
Owen Cliffe [EMAIL PROTECTED] writes: Is there a good reason why retrieve_tree doesn't just return the status of the last failed operation on failure? The original reason (which I don't claim to be good) is that Wget doesn't stop upon error; it continues. Because of this, returning a

Re: with recursive wget status code does not reflect success/failure of operation

2005-09-14 Thread Mauro Tortonesi
At 18:58 on Wednesday, 14 September 2005, Hrvoje Niksic wrote: Owen Cliffe [EMAIL PROTECTED] writes: Is there a good reason why retrieve_tree doesn't just return the status of the last failed operation on failure? The original reason (which I don't claim to be good) is that Wget

recursive and non-clobber

2005-09-01 Thread Niel Drummond
/*/Offer/* -oaug-spider.log -b http://dev.lookfantastic.com/cgi-bin/lf.storefront/ because of the recurring endings in the URIs, this command will fail to save some pages, due to the way non-clobber works and the fact that all files are saved in the same folder. My recompiled wget ignores the --recursive

Re: Hyperlink not detected in recursive downloading (wget 1.9.1)

2005-04-27 Thread nemeth
Quoting Hrvoje Niksic [EMAIL PROTECTED]: Dear Hrvoje, You are right; now I also can't reproduce the bug. I just realized that my second download missed another file as well. When I downloaded the pages, the webserver was a little slow and made long (10 s) pauses too. Perhaps it reached the wget

Re: recursive download from pages with ?dir= references

2005-03-05 Thread Hrvoje Niksic
Gabor Istvan [EMAIL PROTECTED] writes: I would like to know if it is possible to mirror or recursively download web sites that have links like ' .php?dir=./ ' within. If yes, what are the options to apply? I don't see why that wouldn't work. Something like `wget -r URL' should work.

selective recursive downloading

2005-01-21 Thread Gabor Istvan
Dear All: I would like to know how I could use wget to selectively download certain subdirectories of a main directory. Here is what I want to do: Let's assume that we have a directory structure like this: http://url.to.download/something/group-a/want-to-download/
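
A hedged sketch of the usual approach: --include-directories (-I) with --no-parent, using the structure quoted above; -I accepts wildcards in most versions, and explicit comma-separated paths are the conservative fallback.

    wget -r -np -I '/something/*/want-to-download' http://url.to.download/something/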

RE: selective recursive downloading

2005-01-21 Thread Post, Mark K
, January 21, 2005 9:16 AM To: wget@sunsite.dk Subject: selective recursive downloading Dear All: I would like to know how I could use wget to selectively download certain subdirectories of a main directory. Here is what I want to do: Let's assume that we have a directory structure like this: http

Re: ftp recursive behind a proxy

2004-11-15 Thread apmailist
Quoting Jesus Villalba [EMAIL PROTECTED]: Dear Sir or Madam, I've noticed that the following scenario does not work as I expected: My client (wget) is behind an ftp-proxy and I try to download recursively from an ftp server (any one) with the following command: wget --follow-ftp -r -l 2 -p -k ftp

Referer and recursive wgetting

2004-11-15 Thread Adam
I've discovered that wget doesn't process the --referer option correctly when doing recursive wgets. Instead of changing the Referer HTTP field to that specified, wget makes it the original page that was requested through wget. I think this behaviour is incorrect. If it isn't, could you please

ftp recursive behind a proxy

2004-11-10 Thread Jesus Villalba
Dear Sir or Madam, I've noticed that the following scenario does not work as I expected: My client (wget) is behind an ftp-proxy and I try to download recursively from an ftp server (any one) with the following command: wget --follow-ftp -r -l 2 -p -k ftp://ftp-server. In wgetrc I have the following

Re: ftp recursive behind a proxy

2004-11-10 Thread Mauro Tortonesi
At 12:22 on Wednesday, 10 November 2004, you wrote: Dear Sir or Madam, I've noticed that the following scenario does not work as I expected: My client (wget) is behind an ftp-proxy and I try to download recursively from an ftp server (any one) with the following command: wget --follow-ftp -r

Re: Problem with large recursive downloads

2004-11-07 Thread Christian Larsen
Yes. The problem can be reproduced. I've tried to fetch this site with these parameters several times, and it always locks up (but not in the same place). The strange thing is that wget doesn't stop running (I can find it with ps aux | grep wget), but it just doesn't do anything... On 06-11-04

Problem with large recursive downloads

2004-11-06 Thread Christian Larsen
Hello. I'm using wget to download sites for a java-project. Wget is run with the RunTime-class in java. Everything has been working fine until I tried to download really large sites. My example was www.startsiden.no (a Norwegian web-portal with a large amount of external linking) with a depth of

Re: Problem with large recursive downloads

2004-11-06 Thread Mauro Tortonesi
At 13:03 on Saturday, 6 November 2004, Christian Larsen wrote: Hello. I'm using wget to download sites for a java-project. Wget is run with the RunTime-class in java. Everything has been working fine until I tried to download really large sites. My example was www.startsiden.no (a Norwegian

bug regarding recursive retrieval of login/pass protected websites - bad referer

2004-10-31 Thread Martin Vogel
Hi, I'm using wget 1.9.1 and got a problem: when using wget -r -d --referer='http://domain.invalid/login.htm' 'http://user:[EMAIL PROTECTED]://domain.invalid/members/' the first request is proper, but the third and following (the second is the one for robots.txt) send an incorrect referer:

Recursive Accept/Reject outside filename not working. Workaround?

2004-09-14 Thread John Clarke
Hello, It seems that wget ignores anything after the filename when processing its accept/reject parameters. For example, if I want to recursively accept only URLs with _bar at the end, starting with the following URL: http://mysite.org/my.dll?param1=foo_bar (accepted)

wget -r is not recursive

2004-09-13 Thread Helmut Zeisel
I tried wget -r -np -nc http://www.vatican.va/archive/DEU0035/_FA.HTM both with cygwin / Wget 1.9.1 and Linux / Wget 1.8.2. Both return just the single file, but none of http://www.vatican.va/archive/DEU0035/_FA1.HTM http://www.vatican.va/archive/DEU0035/_FA2.HTM etc., which are referenced on

Re: wget -r is not recursive

2004-09-13 Thread Jens Rösner
Hi Helmut! I suspect there is a robots.txt that says index, no follow. Try wget -nc -r -l0 -p -np -erobots=off http://www.vatican.va/archive/DEU0035/_FA.HTM; it works for me. -l0 says: infinite recursion depth; -p means page requisites (not really necessary); -erobots=off orders wget to ignore any

Re: wget -r is not recursive

2004-09-13 Thread Helmut Zeisel
Hi Helmut! Try wget -nc -r -l0 -p -np -erobots=off http://www.vatican.va/archive/DEU0035/_FA.HTM; it works for me. It also works for me. Thank you, Helmut

recursive ftp and proxy

2004-07-23 Thread Eric Leblond
Hi, I was trying to recursively fetch an ftp url but it did not work as I use a proxy (squid). I see a patch from Henrik Nordstrom: http://www.squid-cache.org/mail-archive/squid-users/200304/0570.html It allows doing things like: wget -rl1 --proxy on --follow-ftp ftp://ftp.lip6.fr/ which

Re: recursive and form posts in wget 1.9.1

2004-05-27 Thread Hrvoje Niksic
M P [EMAIL PROTECTED] writes: I'm also trying to automatically login to https://online.wellsfargo.com/cgi-bin/signon.cgi using wget but with no luck so far. Any ideas to get this working is greatly appreciated. I'm finding it hard to try this out, but I *think* that a combination of

Re: recursive and form posts in wget 1.9.1

2004-05-26 Thread M P
I'm also trying to automatically login to https://online.wellsfargo.com/cgi-bin/signon.cgi using wget but with no luck so far. Any ideas to get this working is greatly appreciated. Thanks. PM * From: Greg Underwood * Subject: Re: recursive and form posts in wget 1.9.1 * Date: Mon

Recursive

2004-02-04 Thread Jeffrey Rosenfeld
I am using wget to retrieve large amounts of genetic sequence from various sites. In order to make this more efficient, I am using time-stamping, but for some reason the time-stamping does not seem to work for one of my queries. This query: wget --non-verbose --timestamping

Re: Recursive

2004-02-04 Thread Hrvoje Niksic
I don't see any obvious reason why timestamping would work in one case, but not in the other. One possible explanation might be that the second server does not provide correct time-stamping data. Debug output (with the `-d' switch) might shed some light on this.

Re: recursive and form posts in wget 1.9.1

2004-01-29 Thread Hrvoje Niksic
Greg Underwood [EMAIL PROTECTED] writes: On Tuesday 27 January 2004 05:23 pm, Hrvoje Niksic wrote: Greg Underwood [EMAIL PROTECTED] writes: I took a peek at my cookies while logging into the site in a regular browser. It definitely adds a session cookie when I log in, I think your

Re: recursive and form posts in wget 1.9.1

2004-01-28 Thread Greg Underwood
On Tuesday 27 January 2004 05:23 pm, Hrvoje Niksic wrote: Greg Underwood [EMAIL PROTECTED] writes: I took a peek at my cookies while logging into the site in a regular browser. It definitely adds a session cookie when I log in, I think your problem should be solvable with

Re: recursive and form posts in wget 1.9.1

2004-01-27 Thread Hrvoje Niksic
Greg Underwood [EMAIL PROTECTED] writes: I took a peek at my cookies while logging into the site in a regular browser. It definitely adds a session cookie when I log in, I think your problem should be solvable with `--keep-session-cookies'. The server will have no way of knowing that the two

Re: recursive and form posts in wget 1.9.1

2004-01-26 Thread Greg Underwood
Nicolas, Thanks for the tip. I took a peek at my cookies while logging into the site in a regular browser. It definitely adds a session cookie when I log in, but when I just browse to the login page, it doesn't appear to be adding a session cookie. There's a site cookie there, but I don't

recursive and form posts in wget 1.9.1

2004-01-25 Thread Greg Underwood
on the first wget call the cookies.txt file is invariably empty (save the wget header). Maybe something to do with the SSL connection? I don't know, I'm just taking shots in the dark there. I think I may need to be able to do something like: wget --recursive --post-data='name=me&password=foo'
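
The --keep-session-cookies suggestion from the replies above turns this into a two-step sketch (host, form fields, and paths hypothetical, following the poster's example):

    # log in once; session cookies are only written out with --keep-session-cookies
    wget --save-cookies cookies.txt --keep-session-cookies \
         --post-data 'name=me&password=foo' https://www.example.com/login
    # then recurse with the saved session, without re-posting
    wget --load-cookies cookies.txt -r https://www.example.com/members/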

Recursive ftp

2003-12-07 Thread Gisle Vanem
Some minor issues with recursive ftp. If all extensions are rejected (no file in .listing is accepted), Wget still issues a PORT and an empty RETR command. Is this WAD (working as designed)? E.g. wget -r -Ahtm ftp://host/foo/ when I really intended -Ahtml. It would be nice if Wget could say

Re: Recursive ftp broken

2003-11-26 Thread Gisle Vanem
Interestingly, I can't repeat this. Still, to be on the safe side, I added some additional restraints to the code that make it behave more like the previous code, that worked. Please try again and see if it works now. If not, please provide some form of debugging output as well. This

Re: Recursive ftp broken

2003-11-25 Thread Hrvoje Niksic
Thanks for the report, this is most likely caused by my recent changes that eliminate rbuf* from the code. (Unfortunately, the FTP code kept some state in struct rbuf, and my changes might have broken things.) To be absolutely sure, see if it works under 1.9.1 or under CVS from one week ago.

Re: Recursive ftp broken

2003-11-25 Thread Hrvoje Niksic
Gisle Vanem [EMAIL PROTECTED] writes: [...] ==> SYST ... done. ==> PWD ... done. ! is '/' here ==> TYPE I ... done. ==> CWD not required. ==> PORT ... done. ==> RETR BAN-SHIM.ZIP ... No such file `BAN-SHIM.ZIP'. ... Interestingly, I can't repeat this. Still, to be on the safe side,

Recursive ftp broken

2003-11-22 Thread Gisle Vanem
I don't know when it happened, but the latest CVS version breaks recursive ftp download. I tried with this: wget -rAZIP ftp://ftp.mpoli.fi/pub/software/DOS/NETWORK/ and the result is: --20:46:02-- ftp://ftp.mpoli.fi/pub/software/DOS/NETWORK/ => `ftp.mpoli.fi/pub/software/DOS/NETWORK

recursive wget doesn't work properly

2003-11-16 Thread Marius Andreiana
Hi, wget -r --no-parent http://www.desktoplinux.com/files/article003/ stops at the first slides (4-6) instead of getting the whole presentation. -- Marius Andreiana, Galuna - Linux Solutions in Romania, http://www.galuna.ro

Re: recursive wget doesn't work properly

2003-11-16 Thread Hrvoje Niksic
Marius Andreiana [EMAIL PROTECTED] writes: wget -r --no-parent http://www.desktoplinux.com/files/article003/ stops at the first slides (4-6) instead of getting the whole presentation. Which version of Wget are you using? Can you send the debug output from the Wget run?

Re: recursive wget doesn't work properly

2003-11-16 Thread Marius Andreiana
Hi, On Sun, 2003-11-16 at 23:08, Hrvoje Niksic wrote: Which version of Wget are you using? Can you send the debug output from the Wget run? GNU Wget 1.8.2 (Fedora Core 1). Debug output attached. As you see, it won't get slide 7. Thank you! -- Marius Andreiana, Galuna - Linux Solutions in

Re: recursive wget doesn't work properly

2003-11-16 Thread Hrvoje Niksic
Marius Andreiana [EMAIL PROTECTED] writes: On Sun, 2003-11-16 at 23:08, Hrvoje Niksic wrote: Which version of Wget are you using? Can you send the debug output from the Wget run? GNU Wget 1.8.2 (Fedora Core 1). Debug output attached. As you see, it won't get slide 7. Maximum

Problem recursive download

2003-10-16 Thread Sergey Vasilevsky
I use wget 1.8.2. I try a recursive download of www.map-by.info/index.html, but wget stops at the first page. Why? index.html has links to other pages. /usr/local/bin/wget -np -r -N -nH --referer=http://map-by.info -P /tmp/www.map-by.info -D map-by.info http://map-by.info http://www.map-by.info --10:09:25

RE: Problem recursive download

2003-10-16 Thread Sergey Vasilevsky
AM To: [EMAIL PROTECTED] Subject: Problem recursive download I use wget 1.8.2. I try a recursive download of www.map-by.info/index.html, but wget stops at the first page. Why? index.html has links to other pages. /usr/local/bin/wget -np -r -N -nH --referer=http://map-by.info -P /tmp/www.map

Re: Problem recursive download

2003-10-16 Thread Hrvoje Niksic
This seems to work in my copy of 1.8.2. Perhaps you have something in your .wgetrc that breaks things?

Re: Problem recursive download

2003-10-16 Thread Hrvoje Niksic
Sergey Vasilevsky [EMAIL PROTECTED] writes: I think wget strictly verifies link syntax: a href=about_rus.html onMouseOver=img_on('main21'); onMouseOut=img_off('main21') That link has an incorrect symbol ';' not quoted in the a tag. You are right. However, this has been fixed in Wget 1.9-beta, which

Re: wget 1.9 - behaviour change in recursive downloads

2003-10-07 Thread Jochen Roderburg
--recursive retrieval would make no sense otherwise. J. Roderburg
