Then your problem isn't with wget. Once you figure out how to access the
file in a web browser, use the same URL in wget.
Tony
- Original Message -
From: Bufford, Benjamin (AGRE) [EMAIL PROTECTED]
To: Tony Lewis [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, May 26, 2004 8:41 AM
[EMAIL PROTECTED] wrote:
Well, I found out a little bit more about the
real reason for the problem. Opera has a very
convenient option called Encode International
Web Addresses with UTF-8. When I had this
option checked, it could retrieve the file
without problems. Without this option
[EMAIL PROTECTED] wrote:
I came across a problem accessing an FTP site where
the password contained an @ sign. The password was
[EMAIL PROTECTED] So I tried the following:
wget -np --server-response -H --tries=1 -c
--wait=60 --retry-connrefused -R *
ftp://guest:[EMAIL
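Since the @ in the password also looks like the end of the user:password part of the URL, one fix that should work is to percent-encode that @ as %40; the host and password below are placeholders for the real ones:
wget -np --server-response --tries=1 -c --wait=60 --retry-connrefused \
     "ftp://guest:secret%40word@ftp.example.com/dir/file"
wget should decode the %40 back into @ before logging in, and the quotes guard the URL against shell interpretation.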
Juhana Sadeharju wrote:
I placed use_proxy = off in .wgetrc (a file I did not have earlier)
and in ~/wget/etc/wgetrc (a file I had), and tried
wget --proxy=off http://www.maqamworld.com
and it still does not work.
Could there be some system wgetrc files somewhere? I have compiled
running older OS versions to confirm that.) For
example, on my Windows XP machine, I have the following variables:
HOMEDRIVE=C:
HOMEPATH=\Documents and Settings\Tony Lewis
so my home directory is C:\Documents and Settings\Tony Lewis
HTH,
Tony
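To take the wgetrc files out of the picture entirely, the setting can also be forced for a single run with -e, which executes a wgetrc command from the command line (the URL is the one from the original message):
wget -e use_proxy=off http://www.maqamworld.com/
Besides ~/.wgetrc, wget also reads a system-wide wgetrc (its location is chosen at compile time, often /usr/local/etc/wgetrc) and honors the WGETRC environment variable, which names a file used in place of ~/.wgetrc, so those are worth checking too.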
robi sen wrote:
Hi, I have a client who basically needs to regularly grab content from
part of their website and mirror it and/or save it so they can
disseminate it as HTML on a CD. The website, though, is written in
ColdFusion and requires application-level authentication which is just
form
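One commonly suggested approach, sketched here with placeholder URLs and form field names (it needs a wget with --post-data, i.e. 1.9 or later): submit the login form once, save the resulting cookie, then mirror with that cookie loaded:
wget --save-cookies=cookies.txt --keep-session-cookies \
     --post-data='username=USER&password=PASS' \
     -O /dev/null http://www.example.com/login.cfm
wget --load-cookies=cookies.txt --mirror --convert-links --html-extension \
     http://www.example.com/section/
(--keep-session-cookies is only available in newer wgets; without it, session-only cookies are not written to the file.) The mirrored tree, with links converted and .html extensions added, can then be burned to the CD.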
Kazu Yamamoto wrote:
Since I have experience modifying IPv4-only programs, including FTP
and HTTP clients, to dual IPv4/IPv6 ones, I know this problem. Yes, some parts of
wget *would* remain protocol dependent.
Kazu, it's been said that a picture is worth a thousand words. Perhaps in
this case, a patch
Anurag Jain wrote:
downloading a big .bin file (268 MB) using the wget command on our Solaris
box, using
wget <http url>/<bin filename>, which is located on some web server. It starts
downloading,
and after 42% it gives a "no disk space available" message and it gets
stopped, although I
checked on the server, lot more
Kazu Yamamoto wrote:
Thank you for supporting IPv6 in wget v1.9.1. Unfortunately, wget
v1.9.1 does not work well, at least on NetBSD.
NetBSD does not allow the use of IPv4-mapped IPv6 addresses, for security
reasons. For the background on this, please refer to:
YOSHIFUJI Hideaki wrote:
NetBSD etc. are NOT RFC compliant here; however, it would be better if one
supports a wider range of platforms / configurations.
My patch is a quick hack, but I believe that it should work for NetBSD
and FreeBSD 5. Please consider applying it.
It's not my call as to whether
Gisle Vanem wrote:
I've searched google and the only way AFAICS to get redirection
in a GUI app to work is to create 3 pipes. Then use a thread (or
run_with_timeout with infinite timeout) to read/write the console
handles to put/get data into/from the parent's I/O handles. I don't
fully
- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 07, 2003 8:04 AM
Subject: wget Suggestion: ability to scan ports BESIDE #80, (like 443)
Anyway Thanks for WGET!
What's wrong with wget https://www.somesite.com ?
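If the point is a non-default port rather than SSL, the port can also be given directly in the URL (placeholder host):
wget http://www.somesite.com:8080/
wget understands whatever port follows the host name, and https:// URLs work when wget is built with SSL support.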
Danny Linkov wrote:
I'd like to download recursively the content of a web
directory WITHOUT AN INDEX file.
What shows up in your web browser if you enter the directory (such as
http://www.somesite.com/dir/)?
The most common responses are:
* some HTML file selected by the server (often
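When the server does send back a generated listing for the directory, a shallow recursive fetch limited to that directory usually does the trick (using the example directory above):
wget -r -np -l 1 http://www.somesite.com/dir/
If the server returns 403 Forbidden instead, there is no listing for wget to recurse over and the files cannot be discovered.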
[EMAIL PROTECTED] wrote:
I am not sure if this is a bug, but it's really not what I expected.
Here is the way to reproduce the problem.
1. Put the URL http://ichart.yahoo.com/b?s=CSCO into the browser and
then drag out the image. It should be a file with .png extension. So I
believe this
antonio taylor wrote:
http://fisrtname lastname:[EMAIL PROTECTED]
Have you tried http://fisrtname%20lastname:[EMAIL PROTECTED] ?
Hrvoje Niksic wrote:
Have you seen the rest of the discussion? Would it do for you if Wget
correctly handled something like:
wget --header='Host: jidanni.org' http://216.46.192.85/
I think that is an elegant solution.
Tony
Hrvoje Niksic wrote:
Assume that Wget has retrieved a document from the host A, which
hasn't closed the connection in accordance with Wget's keep-alive
request.
Then Wget needs to connect to host B, which is really the same as A
because the provider uses DNS-based virtual hosts. Is it OK
Hrvoje Niksic wrote:
The thing is, I don't want to bloat Wget with obscure options to turn
off even more obscure (and *very* rarely needed) optimizations. Wget
has enough command-line options as it is. If there are cases where
the optimization doesn't work, I'd rather omit it completely.
Hrvoje Niksic wrote:
I'm curious... is anyone using the patch list to track development?
I'm posting all my changes to that list, and sometimes it feels a lot
like talking to myself. :-)
I read the introductory stuff to see what's changed, but I never extract the
patches from the messages.
Hrvoje Niksic wrote:
Incidentally, Wget is not the only browser that has a problem with
that. For me, Mozilla is simply showing the source of
http://www.minskshop.by/cgi-bin/shop.cgi?id=1&cookie=set, because
the returned content-type is text/plain.
On the other hand, Internet Explorer will
Philip Mateescu wrote:
A warning message would be nice when, for not so obvious reasons, wget
doesn't behave as one would expect.
I don't know if there are other tags that could change wget's behavior
(like -r and <meta name=robots> do), but if they happen it would be
useful to have a message.
Hrvoje Niksic wrote:
I'm about to release 1.9 today, unless it takes more time to upload it
to ftp.gnu.org.
If there's a serious problem you'd like fixed in 1.9, speak up now or
be silent until 1.9.1. :-)
I thought we were going to turn our attention to 1.10. :-)
I'm trying to figure out how to do a POST followed by a GET.
If I do something like:
wget http://www.somesite.com/post.cgi --post-data 'a=1&b=2'
http://www.somesite.com/getme.html -d
I get the following behavior:
POST /post.cgi HTTP/1.0
snip
[POST data: a=1&b=2]
snip
POST /getme.html HTTP/1.0
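Until the multiple-URL semantics get sorted out, the behavior I'm after can be approximated with two invocations, carrying any session cookie from the POST over to the GET (same placeholder site as above; --keep-session-cookies only if your wget has it):
wget --save-cookies=cookies.txt --keep-session-cookies \
     --post-data='a=1&b=2' http://www.somesite.com/post.cgi
wget --load-cookies=cookies.txt http://www.somesite.com/getme.html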
Hrvoje Niksic wrote:
Maybe the right thing would be for `--post-data' to only apply to the
URL it precedes, as in:
wget --post-data=foo URL1 --post-data=bar URL2 URL3
snip
But I'm not at all sure that it's even possible to do this and keep
using getopt!
I'll start by saying that I
Hrvoje Niksic wrote:
I like these suggestions. How about the following: for 1.9, document
that `--post-data' expects one URL and that its behavior for multiple
specified URLs might change in a future version.
Then, for 1.10 we can implement one of the alternative behaviors.
That works for
Hrvoje and I have had an off-list dialogue about this subject. We've settled
on HUR-voy-eh as the closest phonetic rendition of his name for English
speakers. It helps to remember that the r is rolled.
Tony
I've been on this list for a couple of years now and I've always wondered
how our illustrious leader pronounces his name.
Can you give us linguistically challenged Americans a phonetic rendition of
your name?
Tony Lewis (toe knee loo iss)
Hrvoje Niksic wrote:
Please be aware that Wget needs to know the size of the POST data
in advance. Therefore the argument to @code{--post-file} must be
a regular file; specifying a FIFO or something like
@file{/dev/stdin} won't work.
There's nothing that says you have to
Hrvoje Niksic wrote:
I don't understand what you're proposing. Reading the whole file in
memory is too memory-intensive for large files (one could presumably
POST really huge files, CD images or whatever).
I was proposing that you read the file to determine the length, but that was
on the
Hrvoje Niksic wrote:
That would work for short streaming, but would be pretty bad in the
mkisofs example. One would expect Wget to be able to stream the data
to the server, and that's just not possible if the size needs to be
known in advance, which HTTP/1.0 requires.
One might expect it,
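For finite streams the usual workaround is to spool the data to a temporary file first, so wget can learn its size before posting (the paths and URL below are placeholders, and mkisofs is only the example from the thread):
mkisofs -R /some/dir > /tmp/image.iso
wget --post-file=/tmp/image.iso http://www.example.com/upload.cgi
rm /tmp/image.iso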
Suhas Tembe wrote:
1). I go to our customer's website every day and log in using a User Name &
Password.
[snip]
4). I save the source to a file & subsequently perform various tasks on
that file.
What I would like to do is automate this process of obtaining the source
of a page using wget. Is
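If the site uses plain HTTP authentication, a single command run from cron should do it (URL, names and file name below are placeholders):
wget --http-user=NAME --http-passwd=SECRET -O page.html http://customer.example.com/status.asp
If the login is a web form instead, submit it once with --post-data and --save-cookies, then fetch the page with --load-cookies.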
Hrvoje Niksic wrote:
I'm curious: what is the use case for this? Why would you want to
save the unfollowed links to an external file?
I use this to determine what other websites a given website refers to.
For example:
wget
Daniel Stenberg wrote:
The GNU project is looking for a new maintainer for wget, as the current
one
wishes to step down.
I think that means we need someone who:
1) is proficient in C
2) knows Internet protocols
3) is willing to learn the intricacies of wget
4) has the time to go through
Rajesh wrote:
Wget is not mirroring the web site properly. For example, it is not copying
symbolic
links from the main web server. The target directories do exist on the
mirror
server.
wget can only mirror what can be seen from the web. Symbolic links will be
treated as hard references (assuming
Rajesh wrote:
Thanks for your reply. I have tried using the command wget
--user-agent=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1), but it
didn't
work.
Adding the user agent helps some people -- I think most often with web
servers from the evil empire.
I have one more question. In
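One thing worth checking: the user-agent string contains spaces and parentheses, so the shell will split it into several arguments unless the whole value is quoted (the URL below is a placeholder):
wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" http://www.example.com/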
I tried to retreive a URL with Internet Explorer and it continued to
retrieve the URL forever. I tried to grab that same URL with wget, which
tried twice and then reported redirection cycle detected.
Perhaps we should send the wget code to someone in Redmond.
Tony
Aaron S. Hawley wrote:
why not just have the default wget behavior follow comments explicitly
(i've lost track whether wget does that or needs to be amended) /and/
have an option that goes /beyond/ quirky comments and is just
--ignore-comments ? :)
The issue we've been discussing is what to
Aaron S. Hawley wrote:
i'm just saying what's going to happen when someone posts to this list:
My Web Pages have [insert obscure comment format] for comments and Wget
is considering them to (not) be comments. Can you change the [insert
Wget comment mode] comment mode to (not) recognize my
Georg Bauhaus wrote:
I don't think so. Actually the rules for SGML comments are
somewhat different.
Georg, I think we're talking about apples and oranges here. I'm talking
about what is legitimate in a comment in an SGML document. I think you're
talking about what is legitimate as a comment
George Prekas wrote:
You are probably right. I have pointed this out because I have seen pages that
use as a separator <!-- with lots of dashes, and although
Internet Explorer shows the page, wget cannot download it correctly. What
do you think about finishing the comment at the >?
After
George Prekas wrote:
I have found a bug in Wget version 1.8.2 concerning comment handling (
<!--
comment --> ). Take a look at the following illegal HTML code:
<HTML>
<BODY>
<a href=test1.html>test1.html</a>
<!--
<a href=test2.html>test2.html</a>
<!--
</BODY>
</HTML>
Now, save the above snippet as
Dick Penny wrote:
I have just successfully used WGET on a single file download. I even
figured
out how to specify a destination. But, I cannot seem to get wildcards
to
work. Help please:
wget -o log.txt -P c:/Documents and Settings/Administrator/My
Documents/CME_data/bt
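Two things to check, shown with a placeholder URL since the original command was cut off: the -P path contains spaces, so it needs quotes, and for HTTP downloads wildcards go into -A (an accept pattern used with -r), not into the URL itself:
wget -o log.txt -P "c:/Documents and Settings/Administrator/My Documents/CME_data/bt" \
     -r -np -l 1 -A "*.csv" http://www.example.com/bt/
(the *.csv pattern is only an example). Wildcards written directly into the URL only work for FTP, e.g. ftp://ftp.example.com/bt/*.csv.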
Dan Mahoney, System Admin wrote:
Assume I have a site that I want to create a static mirror of. Normally
this site is database driven, but I figure if I spider the entire site,
and map all the GET URLS to static urls I can have a full mirror. Has
anyone known of this being successfully
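People have done this with wget itself; a minimal sketch, assuming the site can be crawled without logging in (placeholder URL):
wget --mirror --convert-links --html-extension --page-requisites http://www.example.com/
--html-extension saves pages fetched from query-string URLs with a .html suffix, and --convert-links rewrites the links between the saved pages so the snapshot can be browsed offline.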
Ryan Underwood wrote:
It seems that some servers are broken and in order to fetch files with
certain
filenames, some characters that are normally encoded in HTTP sequences
must
be sent through unencoded. For example, I had a server the other day that
I was fetching files from at the URL:
Johannes Berg wrote:
Maybe this isn't really a bug in wget but rather in the file, but since
this is standard as exported from MS Word I'd like to see wget recognize
the images and download them.
Microsoft Word claims to create a valid HTML file. In fact, what it creates
can only reliably be
cyprien wrote:
I want to mirror my home site; everything works fine except one thing:
my site is a photo site based on PHP scripts: gallery
(http://gallery.sourceforge.net)
it also has some JavaScript...
[snip]
what can i do to have that (on mirror site) :
You cannot because wget does
Nandita Shenvi wrote:
I have not copied the whole script but just the last few lines. The
variable
$all_links[3] has a URL:
http://bolinux39.europe.nokia.com/database2/MIDI100/GS001/01FINALC.MID.
The link points to a file which I require.
I remove the http:// before calling wget, but I
Frank Helk wrote:
Free (web based) scanning is available at http://www.antivirus.com.
Select Free tools in the top menu and then Scan Your PC, Free from
the list. You'll not even have to register to use it. Please.
It may not be so simple. Klez uses anti-anti-virus techniques to prevent
Herold Heiko wrote:
It would be better imho if the options themselves are modified; in that case
the
variable option wouldn't be necessary. Supposing we keep the @ and :, this
could be
--@http-passwd=passwd.txt --:proxy-passwd=0
It seems to me that a convention like this should be adopted (or
Brix Lichtenberg wrote:
But I'm still getting three or more virus mails with attachments 100k+
daily from the wget lists and they're blocking my mailbox (dial-up). And
getting those dumb system warnings accompanying them doesn't make it
better. Isn't there really no way to stop that (at
Hrvoje Niksic wrote:
If your point is that Wget should print a warning when it can *prove*
that the Content-Length data it received was faulty, as in the case of
having received more data, I agree. We're already printing a similar
warning when Last-Modified is invalid, for example.
I'm
Maciej W. Rozycki wrote:
Hmm, it's too fragile in my opinion. What if a new version of Apache
defines a new format?
I think all of the expressions proposed thus far are too fragile. Consider
the following URL:
http://www.google.com/search?num=100&q=%2Bwget+-GNU
The regular expression needs
Maciej W. Rozycki wrote:
I'm not sure what you are referring to. We are discussing a common
problem with static pages generated by default by Apache as index.html
objects for server's filesystem directories providing no default page.
Really? The original posting from Jamie Zawinski said:
Hrvoje Niksic wrote:
Is there any way to make Wget use HTTP/1.1 ?
Unfortunately, no.
In looking at the debug output, it appears to me that wget is really sending
HTTP/1.1 headers, but claiming that they are HTTP/1.0 headers. For example,
the Host header was not defined in RFC 1945, but wget
Hrvoje Niksic wrote:
The one remaining problem is the ETA. Based on the current speed, it
changes value wildly. Of course, over time it is generally
decreasing, but one can hardly follow it. I removed the flushing by
making sure that it's not shown more than once per second, but this
Hrvoje Niksic wrote:
I'll grab the other part and explain what curl does. It shows a
current
speed based on the past five seconds,
Does it mean that the speed doesn't change for five seconds, or that
you always show the *current* speed, but relative to the last five
seconds? I may be
Andre Majorel wrote:
Yes, that allows me to specify _A_ referrer, like www.aol.com. When I'm
trying to help my users mirror their old angelfire pages or something
like
that, very often the link has to come from the same directory. I'd like
to see something where when wget follows a
Ian Abbott wrote:
For example, a recursive retrieval on a page like this:
<html>
<body>
<script>
<a href=foo.html>foo</a>
</script>
</body>
</html>
will retrieve foo.html, regardless of the <script>...</script>
tags.
We seem to be talking about two completely different things, Ian. A
Csaba Ráduly wrote:
I see that wget handles SCRIPT with tag_find_urls, i.e. it tries to
parse whatever is inside it.
Why was this implemented? JavaScript is mostly
used to construct links programmatically. wget is likely to find
bogus URLs until it can properly parse JavaScript.
wget is
I wrote:
wget is parsing the attributes within the script tag, i.e., <script
src=url>. It does not examine the content between <script> and
</script>.
and Ian Abbott responded:
I think it does, actually, but that is mostly harmless.
You're right. What I meant was that it does not examine the
Daniel Stenberg responded to my original suggestion:
With this information, any time that wget encounters a form whose action
is
/cgi-bin/auth.cgi, it will enqueue the submission of the form using
the
values provided for the fields id and pw.
Now, why would wget do this?
There are many