Hello,
I am probably just too lazy and haven't spent enough time reading the man
page, and wget can actually do exactly what I want.
If so -- I do apologize for taking your time.
Otherwise: THANKS for your time! :-)
My problem is redirects.
I am trying to catch them by using, say, netcat ... or by writing some simple
pieces of software -- sending an HTTP GET and catching the "Location:" header
in the response. What I've found out is that (obviously) wget is way more
sophisticated and can do a much better job, especially in certain cases.
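
For illustration, here is a minimal sketch of the "simple piece of software"
approach. It is only an assumption about how such a tool might look, not what
I actually run: the function name redirect_target is made up, and it sends
HEAD rather than GET so that no body comes back. Some servers answer HEAD
differently from GET, so treat the result with care.

```python
import http.client
from urllib.parse import urlsplit


def redirect_target(url, timeout=10.0):
    """Send one HEAD request; return the Location header or None.

    Hypothetical helper for illustration. It does NOT follow the
    redirect, it only reports where the server points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port,
                                           timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port,
                                          timeout=timeout)
    try:
        # Rebuild the request target from path and query string.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        # Only 3xx responses are treated as redirects here.
        if 300 <= response.status < 400:
            return response.getheader("Location")
        return None
    finally:
        conn.close()
```

Calling it in a loop, feeding each returned value back in, would walk a
redirect chain one hop at a time; relative Location values would first have
to be resolved against the current URL.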
I started using it by basically capturing stderr from wget [params my_urls]
and then parsing it, looking for the "^Location: " pattern.
Works great.
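
The parsing step can be sketched roughly like this. It assumes the captured
stderr is already in a string and that redirect lines look like
"Location: URL [following]"; the function names are made up for the example.

```python
import re

# Matches lines such as "Location: http://example.com/new [following]".
# Only the URL (the first run of non-space characters) is captured.
LOCATION_RE = re.compile(r"^Location: (\S+)", re.MULTILINE)


def extract_locations(wget_stderr):
    """Return every redirect target, in the order wget reported them."""
    return LOCATION_RE.findall(wget_stderr)


def final_location(wget_stderr):
    """Return the last redirect target, or None if there was no redirect."""
    locations = extract_locations(wget_stderr)
    return locations[-1] if locations else None
```

With a chain of redirects, the last match is the closest thing to the
canonical URL; with no match at all, the original URL was not redirected.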
The downside is performance.
You see, I don't need the actual content, only the canonical URL. But
wget downloads it anyway, no matter what.
Since (from my perspective) this is a case of "If Wget does not behave
as documented, it's a bug", to quote the man page, I am taking the liberty of
'filing a bug'.
(The "expected" behavior I'm talking about is this: if I use
"--spider", I expect wget to do nothing more once it has found the server.
In particular, it should not send a GET to the server and get HTML back.)
That's my bug, and/or a feature I'd really like to have. An alternative
would be adding --some_flag=n, meaning "receive no more than n lines of
HTML".
Do you think this could be a useful feature that other people would
probably love too?
Thanks for your time and for a great tool,
Vlad.