By the way, as you might have noticed, I wanted to replace the real domain names with example.com, but forgot to replace the last argument. :)

>>   --domains='www.example.com,a.example.com,b.example.com'
>>   --user-agent='Example'
>>   --output-file='example.log'
>>   'www.euroskop.cz'

So, just for the record, the real --domains value was 'www.euroskop.cz,www2.euroskop.cz,rozcestnik.euroskop.cz'.

In this case, that doesn't change the output, though.

Have a nice day,
Stefan

On 28.08.2006 16:44, Mauro Tortonesi wrote:
Stefan Melbinger wrote:
Hello everyone,

I'm having trouble with the newest trunk version of wget (revision 2187).

Command-line arguments:

wget
  --recursive
  --spider
  --no-parent
  --no-directories
  --follow-ftp
  --retr-symlinks
  --no-verbose
  --level='2'
  --span-hosts
  --domains='www.example.com,a.example.com,b.example.com'
  --user-agent='Example'
  --output-file='example.log'
  'www.euroskop.cz'

Results in:

wget: url.c:1934: getchar_from_escaped_string: Assertion `str && *str' failed.
Aborted

Can somebody reproduce this problem? Am I using illegal combinations of arguments? Any ideas?

(Worked before the newest patch.)
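
For readers unfamiliar with this kind of abort: the message above says that a precondition of the form `str && *str` did not hold, i.e. the function was handed a NULL or empty string. The sketch below only illustrates that failure mode. It is not wget's url.c code; decode_one_char and decoded_length are hypothetical names, and the decoding logic is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit (already validated with isxdigit) to its value. */
static int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    ch = (char) tolower((unsigned char) ch);
    return ch - 'a' + 10;
}

/* Hypothetical stand-in, NOT wget's implementation: decode one character
   from a possibly %-escaped string and return the number of input bytes
   consumed (3 for a "%XX" escape, 1 otherwise). */
static int decode_one_char(const char *str, char *out)
{
    /* Same precondition as in the failed assertion: non-NULL, non-empty. */
    assert(str && *str);

    if (str[0] == '%' && isxdigit((unsigned char) str[1])
        && isxdigit((unsigned char) str[2])) {
        *out = (char) (hex_value(str[1]) * 16 + hex_value(str[2]));
        return 3;
    }
    *out = str[0];
    return 1;
}

/* A caller that checks for NULL and for the end of the string before each
   call never trips the assertion. Returns the number of decoded chars. */
static size_t decoded_length(const char *s)
{
    size_t n = 0;
    char c;

    if (s == NULL)
        return 0;
    while (*s != '\0') {
        s += decode_one_char(s, &c);
        n++;
    }
    return n;
}
```

Under these assumptions, calling decode_one_char("", &c) directly would abort with an assertion message of the same shape as the one reported, while decoded_length("") simply returns 0.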

it's really weird. with this command:

wget -d --verbose --recursive --spider --no-parent --no-directories --follow-ftp --retr-symlinks --level='2' --span-hosts --user-agent='Mozilla/5.001 (windows; U; NT4.0; en-us) Gecko/25250101' --domains='www.example.com,a.example.com,b.example.com' http://www.euroskop.cz/

i get:

---response begin---
HTTP/1.0 200 OK
Date: Mon, 28 Aug 2006 14:35:14 GMT
Content-Type: text/html
Expires: Mon, 28 Aug 2006 14:35:14 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Server: Apache/1.3.26 (Unix) Debian GNU/Linux CSacek/2.1.9 PHP/4.1.2
X-Powered-By: PHP/4.1.2
Pragma: no-cache
Set-Cookie: PHPSESSID=b8af8e220f5f1f7321b86ce0524f88b2; expires=Tue, 29-Aug-06 14:35:14 GMT; path=/
Via: 1.1 proxy (NetCache NetApp/5.6.2R1)

---response end---
200 OK

Stored cookie www.euroskop.cz -1 (ANY) / <permanent> <insecure> [expiry 2006-08-29 16:35:14] PHPSESSID b8af8e220f5f1f7321b86ce0524f88b2
Length: unspecified [text/html]
Closed fd 3
200 OK

index.html: No such file or directory

FINISHED --16:37:42--
Downloaded: 0 bytes in 0 files


it seems there is a weird interaction between cookies and the recursive spider algorithm that makes wget bail out. i'll have to investigate this.


PS: Just FYI, when I compile I get the following warnings:

http.c: In function `http_loop':
http.c:2425: warning: implicit declaration of function `nonexisting_url'

main.c: In function `main':
main.c:1009: warning: implicit declaration of function `print_broken_links'

recur.c: In function `retrieve_tree':
recur.c:279: warning: implicit declaration of function `visited_url'

fixed, thanks.
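
For context, "implicit declaration" warnings like the three quoted above appear when a function is called before the compiler has seen any declaration for it. The usual remedy is a prototype in a header that the calling file includes. The sketch below shows the pattern in a single file; url_was_visited and count_visited are hypothetical names with assumed signatures and are not taken from wget's sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Prototype, normally placed in a shared header. Hypothetical name and
   signature, used only for illustration. */
bool url_was_visited(const char *url);

/* This caller appears before the definition. It compiles without an
   implicit-declaration warning only because the prototype is visible. */
int count_visited(const char *const *urls, int n)
{
    int hits = 0;

    for (int i = 0; i < n; i++)
        if (url_was_visited(urls[i]))
            hits++;
    return hits;
}

/* Definition, which could equally live in another .c file. The single
   hard-coded URL stands in for a real lookup table. */
bool url_was_visited(const char *url)
{
    assert(url != NULL);
    return strcmp(url, "http://www.example.com/") == 0;
}
```

Removing the prototype line would reproduce the same class of diagnostic at the call inside count_visited (newer compilers may treat it as an error rather than a warning).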

