Re: WGET bug...
HARPREET SAWHNEY wrote:
> Hi, I am getting a strange bug when I use wget to download a binary file from a URL versus when I manually download. The attached ZIP file contains two files: 05.upc --- manually downloaded; dum.upc --- downloaded through wget. wget adds a number of ascii characters to the head of the file and seems to delete a similar number from the tail. So the file sizes are the same, but the addition and deletion render the file useless. Could you please direct me on whether I should be using some specific option to avoid this problem?

In the future, it's useful to mention which version of Wget you're using.

The problem you're having is that the server is adding the extra HTML at the front of your session, and then giving you the file contents anyway. It's a bug in the PHP code that serves the file. You're getting this extra content because you are not logged in when you're fetching it. You need to have Wget send a cookie with the login-session information, and then the server will probably stop sending the corrupting information at the head of the file.

The site does not appear to use HTTP's authentication mechanisms, so the [EMAIL PROTECTED] bit in the URL doesn't do you any good. It uses forms-and-cookies authentication.

Hopefully, you're using a browser that stores its cookies in a text format, or that is capable of exporting to a text format. In that case, you can just ensure that you're logged in in your browser, and use the --load-cookies=cookies.txt option so that Wget uses the same session information. Otherwise, you'll need to use --save-cookies with Wget to simulate the login form post, which is tricky and requires some understanding of HTML forms.

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer.
http://micah.cowan.name/
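A minimal sketch of the first approach, assuming the browser's cookies have been exported to a file named cookies.txt (the download URL here is a made-up stand-in):

    # Log in with your browser, export its cookies to cookies.txt,
    # then let Wget present the same session cookie:
    wget --load-cookies=cookies.txt 'http://example.com/download.php?file=05.upc'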
Re: WGET bug...
HARPREET SAWHNEY wrote:
> Hi, Thanks for the prompt response. I am using GNU Wget 1.10.2. I tried a few things on your suggestion but the problem remains. 1. I exported the cookies file in Internet Explorer and specified that in the Wget command line, but the same error occurs. 2. I have an open session on the site with my username and password. 3. I also tried running wget while I am downloading a file from the IE session on the site, but the same error.

Sounds like you'll need to get the appropriate cookie by using Wget to log in to the website. This requires site-specific information from the user-login form page, though, so I can't help you without that. If you know how to read some HTML, then you can find the HTML form used for posting the username/password, and use

    wget --keep-session-cookies --save-cookies=cookies.txt \
        --post-data='USERNAME=FOO&PASSWORD=BAR' ACTION

where ACTION is the value of the form's action field, USERNAME and PASSWORD (and possibly further required values) are field names from the HTML form, and FOO and BAR are the username/password.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer, and GNU Wget Project Maintainer.
http://micah.cowan.name/
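A fuller two-step sketch of the same idea (every URL and field name below is an assumption; take the real ones from the site's actual login form):

    # 1. Simulate the login form post and save the session cookie:
    wget --keep-session-cookies --save-cookies=cookies.txt \
        --post-data='username=myuser&password=mypass' \
        'http://example.com/login.php'

    # 2. Fetch the protected file, presenting the saved cookie:
    wget --load-cookies=cookies.txt 'http://example.com/download.php?file=05.upc'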
[fwd] Wget Bug: recursive get from ftp with a port in the url fails
---BeginMessage---
Hi, I am using wget 1.10.2 on Windows 2003, and have the same problem as Cantara. The file system is NTFS. I found my problem: I wrote the command in Scheduled Tasks like this:

    wget  -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P d:\virus.update\kaspersky

After wget, and before -N, I had typed TWO spaces. After deleting one space, wget works well again. Hope this can help. :)

-- from: baalchina
---End Message---
Re: [fwd] Wget Bug: recursive get from ftp with a port in the url fails
Hrvoje Niksic forwarded this message:
> From: baalchina, Mon, 17 Sep 2007 19:56:20 +0800
> Hi, I am using wget 1.10.2 on Windows 2003, and have the same problem as Cantara. The file system is NTFS. I found my problem: I wrote the command in Scheduled Tasks like this: wget  -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P d:\virus.update\kaspersky -- after wget, and before -N, I typed TWO spaces. After deleting one space, wget works well again. Hope this can help. :)

Hi baalchina,

Hrvoje forwarded your message to the Wget discussion mailing list, where such questions are really more appropriate, especially since Hrvoje is not maintaining Wget any longer, but has left that responsibility to others.

What you're describing does not appear to be a bug in Wget; it's the shell's (or task scheduler's, or whatever's) responsibility to split space-separated elements properly; the words are supposed to already be split apart (properly) by the time Wget sees them.

Also, you didn't really describe what was going wrong with Wget, or what failure message you were seeing (perhaps you'd need to specify a log file with -o log, or via redirection if the command interpreter supports it). However, if the problem is that Wget was somehow seeing the space, as a separate argument or as part of another one, then the bug lies with your task scheduler (or whatever is interpreting the command line).

--
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
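A hedged illustration of the logging suggestion, reusing the poster's own paths (the log file name is made up):

    wget -o D:\virus.update\wget.log -N -i D:\virus.update\scripts\kavurl.txt -r -nH -P d:\virus.update\kaspersky

With -o, any error wget hits inside the scheduled task ends up in wget.log instead of vanishing with the task's console window.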
Re: wget bug?
On Mon, 9 Jul 2007 15:06:52 +1200 [EMAIL PROTECTED] wrote:
> Under win2000/win XP I get "No such file or directory" error messages when using the following command line: wget -s --save-headers "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc" with %1 = 212BI. Any ideas?

hi nikolaus, in windows, you're supposed to use %VARIABLE_NAME% for variable substitution. try using %1% instead of %1.

--
Mauro Tortonesi [EMAIL PROTECTED]
Re: wget bug?
Mauro Tortonesi schrieb:
> hi nikolaus, in windows, you're supposed to use %VARIABLE_NAME% for variable substitution. try using %1% instead of %1.

AFAIK it's OK to use %1, because it is a special case (the first batch-file argument). Also, the error would be a 404 or some wget error if the variable were substituted wrongly or not at all, wouldn't it? (Actually, even then you get a 200 response with that URL.)

I just tried using the command inside a batch file and came across another problem: you used a lowercase -s, which is not recognized by my wget version, but an uppercase -S is. I guess you should change that.

I would guess wget is not in your PATH. Try using c:\path\to\the directory\wget.exe instead of just wget. If this too does not help, add an explicit --restrict-file-names=windows to your options, so wget does not try to use the ? inside a filename (normally not needed). So a should-work-for-all-means version is:

    c:\path\wget.exe -S --save-headers --restrict-file-names=windows "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc"

Of course just one line, but my dumb mail editor wrapped it.

Greetings, Matthias
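Wrapping that in a hypothetical batch file makes the %1 substitution concrete (file name and wget path are assumptions):

    rem fetch.bat -- call as:  fetch.bat 212BI
    c:\path\wget.exe -S --save-headers --restrict-file-names=windows "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc"

Running "fetch.bat 212BI" expands %1 to 212BI before wget ever sees the command line.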
wget bug?
Under win2000/win XP I get "No such file or directory" error messages when using the following command line:

    wget -s --save-headers "http://www.nndc.bnl.gov/ensdf/browseds.jsp?nuc=%1&class=Arc"

with %1 = 212BI. Any ideas? Thank you.

Dr Nikolaus Hermanspahn
Advisor (Science)
National Radiation Laboratory
Ministry of Health
DDI: +64 3 366 5059
Fax: +64 3 366 1156
http://www.nrl.moh.govt.nz
mailto:[EMAIL PROTECTED]
RE: wget bug
Highlord Ares wrote:
> it tries to download web pages named similar to http://site.com?variable=yes&mode=awesome

Since & is a reserved character in many command shells, you need to quote the URL on the command line:

    wget "http://site.com?variable=yes&mode=awesome"

Tony
wget bug
When I run wget on certain sites, it tries to download web pages named similar to http://site.com?variable=yes&mode=awesome. However, wget isn't saving any of these files, no doubt because of some file naming issue? This problem exists in both the Windows and unix versions. Hope this helps.
RE: wget bug
This does not look like a valid URL to me - shouldn't there be a slash at the end of the domain name? Also, when talking about a bug (or anything else), it is always helpful if you specify the wget version (number).

> From: Highlord Ares, Sent: Thursday, May 24, 2007 11:41
> when I run wget on certain sites, it tries to download web pages named similar to http://site.com?variable=yes&mode=awesome. However, wget isn't saving any of these files, no doubt because of some file naming issue? this problem exists in both the Windows and unix versions. hope this helps
WGet Bug: Local URLs containing colons do not work
Hi, I am trying to download a Wiki category for off-line browsing, and am using a command line like this:

    wget http://wiki/Category:Fish -r -l 1 -k

Wiki categories contain colons in their filenames, for example: Category:Fish. If I request that wget convert absolute paths to relative links, then it will create a link like this:

    <a href="Category:Fish" title="Category:Fish">Fish</a>

Unfortunately, this is not a valid URL, because the browser interprets 'Category:' as the invalid protocol "Category", not as part of the local filename 'Category:Fish'.

You can get wget to replace the ':' with an escaped character using --restrict-file-names=windows, but unfortunately this does not fix the problem, because the browser will un-escape the URL and will still continue to look for a file with a colon in it.

I am not sure of the best way to address this bug, because I am not sure if it is possible to escape the ':' to prevent the browser from treating it as a delimiter. It might be best to allow the user to specify some other character, such as '_', to be used in place of the ':' in both filename and URL.

Regards, Peter Fletcher
Re: wget bug in finding files after disconnect
Paul Bickerstaff [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

> I'm using wget version GNU Wget 1.10.2 (Red Hat modified) on a fedora core5 x86_64 system (standard wget rpm). I'm also using version 1.10.2b on a WinXP laptop. Both display the same faulty behaviour, which I don't believe was present in earlier versions of wget that I've used. When the internet connection disconnects, wget automatically tries to redownload the file (starting from where it was disconnected). The problem is that it is consistently failing to find the file. The following output shows what is happening.
>
>     wget -c ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.*.tar.gz
>     [...]
>     Retrying.
>     --14:13:54-- ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/nr.00.tar.gz
>       (try: 2) => `nr.00.tar.gz'
>     Connecting to bio-mirror.jp.apan.net|150.26.2.58|:21... connected.
>     Logging in as anonymous ... Logged in!
>     ==> SYST ... done.    ==> PWD ... done.
>     ==> TYPE I ... done.  ==> CWD not required.
>     ==> PASV ... done.    ==> REST 315859600 ... done.
>     ==> RETR nr.00.tar.gz ... No such file `nr.00.tar.gz'.
>     [...]
>
> I have checked and the files are there and have not moved or altered in any way. I believe that the problem is almost certainly associated with the logged item "CWD not required" after a reconnect. Cheers

I encountered the same situation and solved it this way: call wget with the -B (--base) option to set the base directory and with -i (--input-file) to point to a file containing the relative URLs you want to download. Not tested, but it should look like this:

    wget -c --base=ftp://bio-mirror.jp.apan.net/pub/biomirror/blast/ --input-file=urls.txt

with urls.txt containing nr.*.tar.gz. Hope it helps you. Georg
wget bug
Well, this really isn't a bug per se... but whenever you set -q for no output, it still makes a wget log file on the desktop.
Re: new wget bug when doing incremental backup of very large site
From dev:
> I checked and the .wgetrc file has continue=on. Is there any way to suppress the sending of byte-range requests? I will read through the email and see if I can gather some more information that may be needed.

Remove continue=on from .wgetrc? Consider:

    -N, --timestamping    don't re-retrieve files unless newer than local.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street    (+1) 651-699-9818
Saint Paul MN 55105-2547
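A sketch of the suggested .wgetrc change (assuming the goal is "skip anything already downloaded" rather than "resume partial files"):

    # .wgetrc
    # continue = on      <- remove this line
    timestamping = on    # the equivalent of -N on every run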
new wget bug when doing incremental backup of very large site
I was running wget to test mirroring an internal development site, using large database dumps (binary format) as part of the content to provide me with a large number of binary files for the test. For the test I wanted to see if wget would run and download a quantity of 500K files with 100GB of total data transferred.

The test was going fine and wget ran flawlessly for 3 days, downloading almost the entire contents of the test site; I was at 85GB. wget would have run until the very end and would have passed the test, downloading all 100GB of the test files. Then a power outage occurred; my local test box was not on battery backup, so I had to restart wget and the test. wget did not refetch the binary backup files and gave (for each file that had already been retrieved) the following message:

    => `domain/database/dbdump_107899.gz'
    Connecting to domain|ip|:80... connected.
    HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable
    The file is already fully retrieved; nothing to do.

wget continued to run for about eight hours, gave the above message on several thousand files, then crashed giving:

    wget: realloc: Failed to allocate 536870912 bytes; memory exhausted.

This was surprising because wget ran flawlessly on the initial download for several days, but on a refresh or incremental backup of the data, wget crashed after eight hours. I believe it has something to do with the code that is run when wget finds a local file with the same name and sends a range request. Maybe there is some data structure that keeps getting added to, so that it exhausts the memory on my test box, which has 2GB. There were no other programs running on the test box. This may be a bug.

To get around this for the purposes of my test, I would like to know if there is any way (any switch) to tell wget not to send any type of range request at all if the local filename exists, but to skip sending any request whatsoever if it finds a file with the same name. I do not want it to check whether the file is newer or complete; just skip it and go on to the next file.

I was running wget under cygwin on a Windows XP box. The wget command that I ran was the following:

    wget -m -l inf --convert-links --page-requisites http://domain

I had the following .wgetrc file ($HOME/.wgetrc):

    #backup_converted=on
    page_requisites=on
    continue=on
    dirstruct=on
    #mirror=on
    #noclobber=on
    #recursive=on
    wait=3
    http_user=username
    http_passwd=passwd
    #convert_links=on
    verbose=on
    user_agent=firefox
    dot_style=binary
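A hedged workaround sketch for the "skip files that already exist locally, no request at all" behaviour (note that -nc cannot be combined with -N or with -m, so the mirror shorthand is expanded here):

    wget -r -l inf -nc --page-requisites http://domain

With --no-clobber (-nc), wget sees the existing local file and moves on without sending any range or timestamp request; the continue=on line in .wgetrc would also have to go, since it is what triggers the REST/Range behaviour.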
Re: new wget bug when doing incremental backup of very large site
1. It would help to know the wget version (wget -V).

2. It might help to see some output when you add -d to the wget command line. (One existing file should be enough.) It's not immediately clear whose fault the 416 error is. It might also help to know which Web server is running on the server, and how big the file is which you're trying to re-fetch.

> This was surprising [...]

You're easily surprised.

> wget: realloc: Failed to allocate 536870912 bytes; memory exhausted.

500MB sounds to me like a lot.

> [...] it exhausts the memory on my test box which has 2GB.

A "memory exhausted" complaint here probably refers to virtual memory, not physical memory.

> [...] I do not want it to check to see if the file is newer, if the file is complete, just skip it and go on to the next file.

I haven't checked the code, but with continue=on, I'd expect wget to check the size and date together, and not download any real data if the size checks out and the local file date is later. The 416 error suggests that it's trying to do a partial (byte-range) download, and is failing because either it's sending a bad byte range, or the server is misinterpreting a good byte range. Adding -d should show what wget thinks it's sending. Knowing that and the actual file size might show a problem. If the -d output looks reasonable, the fault may lie with the server, and an actual URL may be needed to pursue the diagnosis from there. The memory allocation failure could be a bug, but finding it could be difficult.

Steven M. Schweda [EMAIL PROTECTED]
382 South Warwick Street    (+1) 651-699-9818
Saint Paul MN 55105-2547
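A minimal sketch of the debugging run being asked for (the URL is a stand-in for one of the already-downloaded dump files):

    wget -V
    wget -d -c -o wget-debug.log http://domain/database/dbdump_107899.gz

The Range header that wget sends shows up in wget-debug.log and can be compared against the actual size of the local file.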
[WGET BUG] - Cannot retrieve image from cacti
Hello,

We are using version 1.10.2 of wget under Ubuntu and Debian, and we have many scripts that get some images from a cacti site. These scripts ran perfectly with version 1.9 of wget, but they cannot get images with version 1.10.2. Here you can find an example of our scripts:

    sub GetCactiGraph() {
        my ($node, $alt, $time, $filename) = @_;
        my $url = "https://foo.bar/cacti/";
        my $b = WWW::Mechanize->new();
        $b->get($url);
        $b->field("login_username", "user");
        $b->field("login_password", "user");
        $b->click();
        if ($b->content() =~ /, gFld\(.*$node, (.+)\)\)/g) {
            $b->get($url . $1);
            if ($b->content() =~ /img src='(graph_image\.php\?local_graph_id=\d+).+' border='0' alt='\s*$alt\s*'/g) {
                my $period = ($time eq "day" ? "&rra_id=1" : "&rra_id=3");
                print "WGET: $url$1$period -O $filename\n";
                if (defined $filename) {
                    `wget -q $url$1$period -O $filename`;
                    return $filename;
                } else {
                    `wget --no-check-certificate -q $url$1$period -O $alt.png`;
                    return "$alt.png";
                }
            }
        }
    }

The file is created but it is empty.

Bye, Thomas
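A plausible fix sketch, not confirmed by the thread: wget 1.10 started verifying SSL certificates by default (--no-check-certificate was introduced for exactly this), and the branch above that runs when $filename is defined omits that flag, so against a self-signed cacti certificate it would fail quietly under -q and leave the -O file empty:

    # hedged change: pass the flag in both branches
    `wget --no-check-certificate -q $url$1$period -O $filename`;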
Re: Wget Bug: recursive get from ftp with a port in the url fails
Jesse Cantara [EMAIL PROTECTED] writes:
> A quick resolution to the problem is to use the -nH command line argument, so that wget doesn't attempt to create that particular directory. It appears as if the problem is with the creation of a directory with a ':' in the name, which I cannot do outside of wget either. I am not sure if that is specific to my filesystem, or to linux in general.

It's not specific to Linux, so it must be your file system. Are you perhaps running Wget on a FAT32-mounted partition? If so, try using --restrict-file-names=windows.

Thanks for the report.
Wget Bug: recursive get from ftp with a port in the url fails
I've encountered a bug when trying to do a recursive get from an ftp site with a non-standard port defined in the url, such as ftp.somesite.com:1234. An example of the command I am typing is:

    wget -r ftp://user:[EMAIL PROTECTED]:4321/Directory/*

where Directory contains multiple subdirectories, all of which I wish to get. The output I get from wget is:

    ==> SYST ... done.  ==> PWD ... done.  ==> TYPE I ... done.
    ==> CWD /Bis ... done.  ==> PASV ... done.  ==> LIST ... done.
    ftp.somehost.com:4321/Directory: No such file or directory
    ftp.somehost.com:4321/Directory/.listing: No such file or directory
    unlink: No such file or directory

And nothing is downloaded; wget stops executing there. A quick resolution to the problem is to use the -nH command line argument, so that wget doesn't attempt to create that particular directory. It appears as if the problem is with the creation of a directory with a ':' in the name, which I cannot do outside of wget either. I am not sure if that is specific to my filesystem, or to linux in general. I am using GNU Wget 1.10.2 in Linux version 2.6.14, Gentoo 3.3.6.

Apologies if this is already known, or if I have not provided enough information. I looked for a bug listing, and attempted to get as much information as I can, but I am not a computer scientist or a programmer. Thank you very much for the wonderful program, it has helped me out in many ways, and I hope this helps the developers.

-Jesse Cantara
wget bug: doesn't CWD after ftp failure
Hi folks,

I think I have found a bug in wget where it fails to change the working directory when retrying a failed ftp transaction. This is wget 1.10.2 on FreeBSD-6.0/amd64.

I was trying to use wget to get files from a broken ftp server which occasionally sends garbled responses, causing wget to get confused, eventually time out, and retry the transfer. (The failure mode which makes it most obvious is sending a response to PASV which lacks the initial numeric response code, so that wget can't recognize it.) This is fine. However, when wget reconnects, it mistakenly thinks it is already in the appropriate directory, and it doesn't change it, reporting "CWD not required". This results in it trying to fetch the file from the root directory instead of the correct path.

Unfortunately I can't give you access to the server in question. I can sanitize the output of a wget session if you want. However, I think the bug is obvious from inspection. At ftp.c:1197 in ftp_loop_internal() we have:

    err = getftp (u, &len, restval, con);

    if (con->csock != -1)
      con->st &= ~DONE_CWD;
    else
      con->st |= DONE_CWD;

This test seems clearly to be backwards. If con->csock is -1 (i.e. the connection has been closed) then we must clear the DONE_CWD flag. Otherwise CWD has been done and we can set the flag. Reversing the test fixes the problem. It also causes the CWD optimization to actually work when it's applicable, instead of only when it isn't :)

It might be worthwhile at other spots in the code to put in an assert() to ensure that we have (DO_CWD || !DO_LOGIN). Perhaps after those flags are set, e.g. ftp.c:1161 in ftp_loop_internal() and ftp.c:1409 in ftp_retrieve_list(). Also, the existence of both DONE_CWD and DO_CWD may cause confusion and could probably be cleaned up.

Thanks for working on wget! It's a great tool.

-- Nate Eldredge [EMAIL PROTECTED]
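A hedged sketch of the suggested assertion (the flag names are as given above; that they live in con->cmd is an assumption from context):

    /* After the DO_* flags are set up: logging in on a fresh connection
       only makes sense if we also plan to re-do the CWD. */
    assert ((con->cmd & DO_CWD) || !(con->cmd & DO_LOGIN));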
Re: wget BUG: ftp file retrieval
[EMAIL PROTECTED] (Steven M. Schweda) writes:
>> and adding it fixed many problems with FTP servers that log you in a non-/ working directory.
> Which of those problems would _not_ be fixed by my two-step CWD for a relative path? That is: [...]

That should work too. On Unix-like FTP servers, the two methods would be equivalent. Thanks for the suggestion. I realized your patch contained improvements for dealing with VMS FTP servers, but I somehow managed to miss this explanation.
Re: wget BUG: ftp file retrieval
> From: Hrvoje Niksic
> [...] On Unix-like FTP servers, the two methods would be equivalent.

Right. So I resisted temptation, and kept the two-step CWD method in my code for only a VMS FTP server. My hope was that someone would look at the method, say "That's a good idea", and change the "if" to let it be used everywhere. Of course, I'm well known to be delusional in these matters.

Steven M. Schweda    (+1) 651-699-9818
382 South Warwick Street    [EMAIL PROTECTED]
Saint Paul MN 55105-2547
wget BUG: ftp file retrieval
Hello,

Current wget seems to have the following bug in the ftp retrieval code. When called like:

    wget user:[EMAIL PROTECTED]/foo/bar/file.tgz

and foo or bar is a read/execute-protected directory while file.tgz is user-readable, wget fails to retrieve the file because it tries to CWD into the directory first. I think the correct behaviour should be not to CWD into the directory, but to issue a GET request with the full path instead (which will succeed).

Best regards, Arne Caspari
Re: wget BUG: ftp file retrieval
Arne Caspari [EMAIL PROTECTED] writes:
> When called like: wget user:[EMAIL PROTECTED]/foo/bar/file.tgz and foo or bar is a read/execute-protected directory while file.tgz is user-readable, wget fails to retrieve the file because it tries to CWD into the directory first. I think the correct behaviour should be not to CWD into the directory but to issue a GET request with the full path instead (which will succeed).

I believe that CWD is mandated by the FTP specification, but you're also right that Wget should try both variants. You can force Wget into getting the file without CWD using this kludge:

    wget ftp://user:[EMAIL PROTECTED]/%2Ffoo%2Fbar%2Ffile.tgz -O file.tgz
Re: wget BUG: ftp file retrieval
Hrvoje Niksic wrote:
> I believe that CWD is mandated by the FTP specification, but you're also right that Wget should try both variants.

i agree. perhaps when retrieving file A/B/F.X we should try to use:

    GET A/B/F.X

first, then:

    CWD A/B
    GET F.X

if the previous attempt failed, and:

    CWD A
    CWD B
    GET F.X

as a last resort. what do you think?

--
Aequam memento rebus in arduis servare mentem...
Mauro Tortonesi                          http://www.tortonesi.com
University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it
Re: wget BUG: ftp file retrieval
Thank you all for your very fast response. As a further note: when this error occurs, wget bails out with the error message "No such directory foo/bar". I think it should instead be "Could not access foo/bar: Permission denied" or similar in such a situation.

/Arne
Re: wget BUG: ftp file retrieval
Mauro Tortonesi [EMAIL PROTECTED] writes:
> perhaps when retrieving file A/B/F.X we should try to use: GET A/B/F.X first, then: CWD A/B, GET F.X if the previous attempt failed, and: CWD A, CWD B, GET F.X as a last resort. what do you think?

That might work. Also don't prepend the necessary prepending of $CWD to those paths.
Re: wget BUG: ftp file retrieval
Hrvoje Niksic [EMAIL PROTECTED] writes:
> That might work. Also don't prepend the necessary prepending of $CWD to those paths.

Oops, I meant "don't forget to prepend".
Re: wget BUG: ftp file retrieval
> From: Hrvoje Niksic
> Also don't [forget to] prepend the necessary [...] $CWD to those paths.

Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to those paths. As you might recall from my changes for VMS FTP servers (if you had ever looked at them), this scheme causes no end of trouble. A typical VMS FTP server reports the CWD in VMS form (for example, SYS$SYSDEVICE:[ANONYMOUS]). It may be willing to use a UNIX-like path in a CWD command (for example, CWD A/B), but it's _not_ willing to use a mix of them (for example, SYS$SYSDEVICE:[ANONYMOUS]/A/B).

At a minimum, a separate CWD should be used to restore the initial directory. After that, you can do what you wish. On my server at least (HP TCPIP V5.4), GET A/B/F.X will work, but the mixed mess is unlikely to work on any VMS FTP server.

Steven M. Schweda    (+1) 651-699-9818
382 South Warwick Street    [EMAIL PROTECTED]
Saint Paul MN 55105-2547
Re: wget BUG: ftp file retrieval
On Fri, 25 Nov 2005, Steven M. Schweda wrote:
> Or, better yet, _DO_ forget to prepend the trouble-causing $CWD to those paths.

I agree. What good would prepending do? It will most definitely add problems such as those Steven describes.

--
-=- Daniel Stenberg -=- http://daniel.haxx.se -=-
ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
Re: wget BUG: ftp file retrieval
> From: Hrvoje Niksic
> Prepending is already there,

Yes, it certainly is, which is why I had to disable it in my code for VMS FTP servers.

> and adding it fixed many problems with FTP servers that log you in a non-/ working directory.

Which of those problems would _not_ be fixed by my two-step CWD for a relative path? That is:

    1. CWD to the string which the server reported in its initial PWD response.
    2. CWD to the relative path in the URL (A/B in our current example).

On a VMS server, the first path is probably pure VMS, so it works, and the second path is pure UNIX, so it also works (on all the servers I've tried, at least). As I remark in the (seldom-if-ever-read) comments in my src/ftp.c, I see no reason why this scheme would fail on any reasonable server. But I'm always open to a good argument, especially if it includes a demonstration of a good counter-example.

This (in my opinion, stinking-bad) prepending code is the worst part of what makes the current (not-mine) VMS FTP server code so awful. (Running a close second is the part which discards the device name from the initial PWD response, which led to a user complaint in this forum a while back, involving an inability to specify a different device in a URL.)

Steven M. Schweda    (+1) 651-699-9818
382 South Warwick Street    [EMAIL PROTECTED]
Saint Paul MN 55105-2547
wget bug
Begin forwarded message (a delivery-failure bounce from the qmail-send program at sunsite.dk; the original report it carried follows):

From: Michael C. Haller [EMAIL PROTECTED]
Subject: wget does not encode UTF-8 properly
Date: Tue, 27 Sep 2005 03:35:54 +0200

wget does not encode UTF-8 properly. wget compiled on Mac OS X Tiger 10.4.2 build 8C46:

    wget --version
    GNU Wget 1.10.1

    --03:24:23-- http://x.dyndns.org/~x/Musik1/Faun/Zauberspru%cc%88che/
        => `x.dyndns.org/~x/Musik1/Faun/Zauberspru%cc%88che/index.html'
    Resolving x.dyndns.org... 84.130.231.75
    Connecting to x.dyndns.org|84.130.231.75|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: unspecified [text/html]
    x.dyndns.org/~x/Musik1/Faun/Zaubersprüche: Invalid argument
    x.dyndns.org/~x/Musik1/Faun/Zaubersprüche/index.html: No such file or directory
    Cannot write to `x.dyndns.org/~x/Musik1/Faun/Zaubersprüche/index.html' (No such file or directory).
    FINISHED --03:24:29-- Downloaded: 0 bytes in 0 files

    --03:24:29-- http://x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/
        => `x.dyndns.org/~x/Musik1/Apocalyptica/Apocalyptica/index.html'
    Resolving x.dyndns.org... 84.130.231.75
    Connecting to x.dyndns.org|84.130.231.75|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: unspecified [text/html]

A second attempt at the Zauberspru%cc%88che/ URL (at 03:26) fails identically: the directory name, a "u" with a combining diaeresis percent-encoded as %cc%88, is turned into a local path that the filesystem rejects with "Invalid argument", so nothing is saved.
wget bug report
Sorry for the crosspost, but the wget Web site is a little confusing on the point of where to send bug reports/patches.

Just installed wget 1.10 on Friday. Over the weekend, my scripts failed with the following error (once for each wget run):

    Assertion failed: wget_cookie_jar != NULL, file http.c, line 1723
    Abort - core dumped

All of my command lines are similar to this:

    /home/programs/bin/wget -q --no-cache --no-cookies -O /home/programs/etc/alte_seiten/xsr.html 'http://www.enterasys.com/download/download.cgi?lib=XSR'

After taking a look at it, I implemented the following change to http.c and tried again. It works for me, but I don't know what other implications my change might have.

    --- http.c.orig Mon Jun 13 08:04:23 2005
    +++ http.c      Mon Jun 13 08:06:59 2005
    @@ -1715,6 +1715,7 @@
       hs->remote_time = resp_header_strdup (resp, "Last-Modified");

       /* Handle (possibly multiple instances of) the Set-Cookie header. */
    +  if (opt.cookies)
       {
         char *pth = NULL;
         int scpos;

Kind regards,

Andrew Jones
MVV Energie AG, Abteilung AI.C
Telefon: +49 621 290-3645, Fax: +49 621 290-2677
E-Mail: [EMAIL PROTECTED]  Internet: www.mvv.de
Re: Wget Bug
Arndt Humpert [EMAIL PROTECTED] writes:
> wget, win32 rel. crashes with huge files.

Thanks for the report. This problem has been fixed in the latest version, available at http://xoomer.virgilio.it/hherold/ .
Wget Bug
Hello,

wget, win32 rel. crashes with huge files.

regards, [EMAIL PROTECTED]

== Command Line

    wget -m ftp://ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/

Assert error while mirroring a big file.

== FTP listing

    P:\temp\wiki\new> ftp ftp.freenet.de
    Connected to ftp-0.freenet.de.
    220 ftp.freenet.de FTP server ready.
    User (ftp-0.freenet.de:(none)): anonymous
    331 Password required.
    Password:
    230 Login completed.
    ftp> cd pub
    250 Changed working directory to /pub.
    ftp> cd filepilot
    250 Changed working directory to /pub/filepilot.
    ftp> cd windows
    250 Changed working directory to /pub/filepilot/windows.
    ftp> cd bildung
    250 Changed working directory to /pub/filepilot/windows/bildung.
    ftp> cd wikipedia
    250 Changed working directory to /pub/filepilot/windows/bildung/wikipedia.
    ftp> dir
    200 PORT command ok.
    150 Opening data connection.
    -rw-r--r-- 1 filepilo ftp      61875 Apr 11 13:06 WikiCover.pdf
    -rw-r--r-- 1 filepilo ftp  344804797 Apr 11 13:20 dbd_76.dbz
    -rw-r--r-- 1 filepilo ftp     425128 Apr 08 13:34 dvdcover_wikipedia.zip
    -rw-r--r-- 1 filepilo ftp 2752401408 Apr 08 15:30 wp_1_2005.iso
    -rw-r--r-- 1 filepilo ftp   14407705 Apr 11 13:06 wpcdhtml.zip
    -rw-r--r-- 1 filepilo ftp   69805003 Apr 11 13:09 wpcdim.zip
    -rw-r--r-- 1 filepilo ftp  701104128 Apr 11 13:34 wpcdiso.iso
    -rw-r--r-- 1 filepilo ftp   10758083 Apr 11 13:07 wpcdmath.zip
    -rw-r--r-- 1 filepilo ftp  121069235 Apr 11 13:12 wpcdxml.zip
    226 Transfer complete.
    ftp: 632 bytes received in 0,03Seconds 19,75Kbytes/sec.
    ftp> bye
    221 Goodbye.

== Version Info

    P:\temp\wiki\new> wget -V
    GNU Wget 1.9.1
    Copyright (C) 2003 Free Software Foundation, Inc.

== Screen Output (Error)

    --11:00:40-- ftp://ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/wp_1_2005.iso
        => `ftp.freenet.de/pub/filepilot/windows/bildung/wikipedia/wp_1_2005.iso'
    ==> CWD not required.
    ==> PORT ... done.  ==> RETR wp_1_2005.iso ... done.
    Length: -1,542,565,888
    [ <=> ] -1,542,565,888  122.04K/s
    Assertion failed: bytes >= 0, file retr.c, line 292
    abnormal program termination
WGET Bug?
    C:\Grabtest\wget.exe -r --tries=3 http://www.xs4all.nl/~npo/ -o C:/Grabtest/Results/log

    --16:23:02-- http://www.xs4all.nl/%7Enpo/
        => `www.xs4all.nl/~npo/index.html'
    Resolving www.xs4all.nl... 194.109.6.92
    Connecting to www.xs4all.nl[194.109.6.92]:80... failed: No such file or directory.
    Retrying.

Is WGET always expecting an index.html file at the URL when grabbing data from the WWW? Most URLs we want to grab are not named index.html but have other names, like:

    http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml
    http://www.ecb.de/stats/exchange/eurofxref/html/index.en.html
    http://www.apx.nl/marketresults.html

Is this a problem for WGET, by the way?

Kind regards,

Peter de Nijs
DELTA N.V., afdeling Portfolio Analyse
06-45 57 29 17
Wget bug
OS = Solaris 8
Platform = Sparc
Test command = /usr/local/bin/wget -r -t0 -m ftp://root:[EMAIL PROTECTED]/usr/openv/var

The directory to synchronize contains some subdirectories and files. Example:

    # ls -la /usr/openv/
    total 68462
    drwxr-xr-x 14 root bin    512 set  1 17:52 .
    drwxr-xr-x 18 root sys    512 dez 16 17:01 ..
    drwxr-xr-x  2 root bin    512 set  1 17:52 bin
    drwxr-xr-x  5 root bin    512 set  1 17:44 db
    drwxr-xr-x  5 root bin   1024 set  1 17:53 java
    drwxr-xr-x  4 root bin   1536 set  1 17:52 lib
    drwxr-xr-x  4 root bin    512 set  1 17:46 man
    drwxr-xr-x  3 root bin    512 set  1 17:46 msg
    drwxr-xr-x 11 root bin   1024 set  2 12:38 netbackup
    drwxr-xr-x  2 root other  512 set  1 14:23 patch
    drwxr-xr-x  2 root bin    512 set  1 17:47 share
    drwxr-xr-x  2 root bin    512 set  1 17:47 tmp
    drwxr-xr-x  5 root bin    512 set  2 09:48 var
    drwxr-xr-x  8 root bin    512 set  1 19:16 volmgr

    # ls -laR /usr/openv/var/
    .:
    total 18
    drwxr-xr-x  5 root bin    512 set  2 09:48 .
    drwxr-xr-x 14 root bin    512 set  1 17:52 ..
    drwxr-xr-x  3 root bin    512 set  1 17:52 auth
    -rw-r--r--  1 root root     9 set  2 09:48 authorize.txt
    -rw-r--r--  1 root other 2956 dez 18  2002 license.txt
    drwx------  2 root other  512 jan  5 20:56 vnetd
    drwxr-xr-x  3 root bin    512 set  1 17:52 vxss

    ./auth:
    total 42
    drwxr-xr-x  3 root bin    512 set  1 17:52 .
    drwxr-xr-x  5 root bin    512 set  2 09:48 ..
    -rw-r--r--  1 root bin    921 out  3  2002 methods.txt
    -rw-r--r--  1 root bin   1415 set  1 12:11 methods_allow.txt
    -rw-r--r--  1 root bin   1599 out  1  2002 methods_deny.txt
    -rw-r--r--  1 root bin   1459 out  1  2002 names_allow.txt
    -rw-r--r--  1 root bin   1701 out  1  2002 names_deny.txt
    -r--r--r--  1 root bin    965 set  1 17:52 template.methods.txt
    -r--r--r--  1 root bin   1387 set  1 17:52 template.methods_allow.txt
    -r--r--r--  1 root bin   1607 set  1 17:52 template.methods_deny.txt
    -r--r--r--  1 root bin   1467 set  1 17:52 template.names_allow.txt
    -r--r--r--  1 root bin   1709 set  1 17:52 template.names_deny.txt
    drwxr-xr-x  4 root other  512 set  1 12:08 vopie

    ./auth/vopie:
    total 8
    drwxr-xr-x  4 root other  512 set  1 12:08 .
    drwxr-xr-x  3 root bin    512 set  1 17:52 ..
    drwx------  3 root other  512 set  1 12:08 hashed
    drwx------  3 root other  512 set  1 12:08 unhashed

Log of the wget command:

    Downloaded: 184 bytes in 1 files
    --18:02:33-- ftp://root:[EMAIL PROTECTED]/usr/openv/var
        => `10.1.1.10/usr/openv/.listing'
    Connecting to 10.1.1.10:21... connected.
    Logging in as root ... Logged in!
    ==> SYST ... done.  ==> PWD ... done.
    ==> TYPE I ... done.  ==> CWD /usr/openv ... done.
    ==> PORT ... done.  ==> LIST ... done.
    [ <=> ] 903 --.--K/s
    18:02:34 (192.12 KB/s) - `10.1.1.10/usr/openv/.listing' saved [903]

    --18:02:34-- ftp://root:[EMAIL PROTECTED]/usr/openv/var
        => `10.1.1.10/usr/openv/var'
    ==> CWD not required.  ==> PORT ... done.
    ==> RETR var ... No such file `var'.
    FINISHED --18:02:34-- Downloaded: 903 bytes in 1 files

NOTE: The ftp command works fine.
Re: wget bug: spaces in directories mapped to %20
Quoting Tony O'Hagan [EMAIL PROTECTED]:
> Original path: abc def/xyz pqr.gif
> After wget mirroring: abc%20def/xyz pqr.gif (broken link)
> wget --version is GNU Wget 1.8.2

This was a well-known error in the 1.8 versions of wget, which is already corrected in the 1.9 versions.

Regards,
Jochen Roderburg
ZAIK/RRZK, University of Cologne
Robert-Koch-Str. 10, D-50931 Koeln, Germany
Tel.: +49-221/478-7024
E-Mail: [EMAIL PROTECTED]
wget bug: spaces in directories mapped to %20
Recently I used the following wget command under a hosted linux account:

    $ wget --mirror <url> -o mirror.log

The web site contained files and virtual directories with spaces in their names. URL encoding translated these spaces to %20. wget correctly URL-decoded the file names (creating file names containing spaces) but incorrectly failed to URL-decode the directory names (creating directory paths containing %20 instead of spaces). The resulting mirror therefore contained broken links. Some hyperlinks were embedded inside flash graphics files, so hyperlink renaming was not an option.

Personally, I would never put a space in a web-hosted file or directory name, but in this case I was migrating a web site that had been developed by someone else. I think that mirroring should work regardless in this case.

Example:

    Original path: abc def/xyz pqr.gif
    After wget mirroring: abc%20def/xyz pqr.gif (broken link)

wget --version is GNU Wget 1.8.2

Thanks for the invaluable wget.
Tony O'Hagan.
wget bug
It seems that wget uses a signed 32-bit value for the content-length in HTTP. I haven't looked at the code, but it appears that this is what is happening. The problem is that when a file larger than about 2GB is downloaded, wget reports negative numbers for its size and quits the download right after it starts. I would assume that somewhere there is a loop that looks something like:

    while( what I've downloaded < what I think the size is ) {
        // do some more downloading
    }

and after the first read from the stream, the loop fails because whatever you read is indeed bigger than a negative number, so it exits. Of course, this is all speculation on my part about what the code looks like, but nonetheless the bug does exist on both linux and cygwin.

Thanks, Matt

BTW: great job, really... on wget and all the GNU software in general... THANKS
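A minimal, self-contained demonstration of the suspected arithmetic (this is not wget's actual code; the variable names are made up):

    /* overflow.c -- a Content-Length above 2^31-1 stored in a signed
       32-bit integer wraps around to a negative value. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t actual_length = 2752401408u;           /* a ~2.6 GB file */
        int32_t content_length = (int32_t) actual_length;
        printf("%d\n", content_length);                 /* -1542565888 */
        return 0;
    }

The printed value matches the "Length: -1,542,565,888" seen in the 2.7 GB ISO report elsewhere in this digest, which supports the signed-32-bit theory.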
wget bug with large files
I got a crash in wget downloading a large iso file (2.4 GB):

    newdeal:/pub/isos# wget -c ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
    --09:22:17-- ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
        => `FC3-i386-DVD.iso'
    Resolving ftp.belnet.be... 193.190.198.20
    Connecting to ftp.belnet.be[193.190.198.20]:21... connected.
    Logging in as anonymous ... Logged in!
    ==> SYST ... done.  ==> PWD ... done.  ==> TYPE I ... done.
    ==> CWD /linux/fedora/linux/core/3/i386/iso ... done.
    ==> SIZE FC3-i386-DVD.iso ... done.  ==> PASV ... done.
    ==> REST 2079173504 ... done.
    ==> RETR FC3-i386-DVD.iso ... done.
    100%[+===>] 2,147,470,560 60.39K/s ETA 00:00
    wget: progress.c:704: create_image: Assertion `insz <= dlsz' failed.
    Aborted

Then I tried to resume the download:

    newdeal:/pub/isos# wget -c ftp://ftp.belnet.be/linux/fedora/linux/core/3/i386/iso/FC3-i386-DVD.iso
    ...
    ==> SIZE FC3-i386-DVD.iso ... done.  ==> PASV ... done.
    ==> REST -2147476576 ... REST failed, restarting from beginning.
    ==> RETR FC3-i386-DVD.iso ... done.
    [ <=> ] 551,648 63.87K/s

Here it deleted the old iso image (2.1 GB downloaded) and started from the beginning... shouldn't it save the new file with a .1 suffix?

Let me know if I can help you track down this bug. Thanks,

--
Roberto Sebastiano [EMAIL PROTECTED]
I want to report a wget bug
Hello!

I am very pleased to use wget to crawl pages; it is an excellent tool. Recently I found a bug in using wget, although I am not sure whether it's a bug or an incorrect usage. I just want to report it here.

When I use wget to mirror or recursively download a web site with the -O option, I mean to save the whole site's pages in one file. But when I type "./wget -m -O filename http://site", I can only save the index file of the site into the file "filename". Surprisingly, when I first type "./wget -m http://site", stop the crawling process after some pages have downloaded successfully (those pages are saved in a hierarchy matching the website itself), and then use the -O option again for the same web site, the mirror option takes effect.

I will be looking forward to hearing from you. Thanks,

jiaming [EMAIL PROTECTED] 2004-11-25
wget -- bug / feature request (not sure)
Hello,

Probably I am just too lazy, haven't spent enough time reading the man page, and wget can actually do exactly what I want. If so -- I apologize for taking your time. Otherwise: THANKS for your time! :-)

My problem is: redirects. I am trying to catch them by using, say, netcat... or writing some simple pieces of software -- sending an HTTP GET and catching the Location: in the response. What I've found out is that (obviously) wget is wa-ay more sophisticated and can do a much better job, especially in certain cases. I started using it by basically catching stderr from "wget [params my_urls]" and then parsing it -- looking for the ^Location: pattern. Works great.

The downside is: performance. You see, I don't need the actual content -- only the canonical URL. But wget just wgets it, no matter what. As long as (from my perspective) this is a case of "If Wget does not behave as documented, it's a bug" -- according to the man page -- I am taking the liberty to 'file a bug'. (The expected behavior I'm talking about is this: if I use --spider, I expect wget to do nothing after finding the server -- like sending GET to the server and getting HTML back.)

That's my bug -- and/or a feature I'd really like to have. An alternative would be: adding --some_flag=n, meaning "receive no more than n lines of html". Do you think that this could be a useful feature that other people would probably love too?

Thanks for your time and for a great tool,
Vlad.
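A sketch of what the poster is after with existing options (hedged: --spider behaviour has varied across wget versions, and in the versions where it issues a HEAD request no body is transferred at all; the URL is a made-up stand-in):

    wget --spider -S 'http://example.com/page-that-redirects' 2>&1 | grep 'Location:'

Even in a version where --spider still performs a full GET, this at least avoids writing the body to disk, though not transferring it.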
Re: wget bug with ftp/passive
On Wed, 21 Jan 2004 23:07:30 -0800, you wrote:
> Hello, I think I've come across a little bug in wget when using it to get a file via ftp. I did not specify the passive option, yet it appears to have been used anyway. Here's a short transcript:

Passive FTP can be specified in /etc/wgetrc or /usr/local/etc/wgetrc, and then it's impossible to turn it off. There is no --active-mode flag as far as I can tell. I submitted a patch to wget-patches under the title of "Patch to add --active-ftp and make --passive-ftp default", which does what it says. Your configuration is setting passive mode to default, but the stock wget defaults to active (active mode doesn't work too well behind some firewalls). --active-ftp is a very useful option in these cases.

Last I checked, the patch hasn't been committed. I can't find the wget-patches mail archives anywhere, either. So I'll paste it here, in hopes that it helps.

-Jeff Connelly

=cut here=

    Common subdirectories: doc.orig/ChangeLog-branches and doc/ChangeLog-branches
    diff -u doc.orig/wget.pod doc/wget.pod
    --- doc.orig/wget.pod   Wed Jul 21 20:17:29 2004
    +++ doc/wget.pod        Wed Jul 21 20:18:56 2004
    @@ -888,12 +888,17 @@
     system-specific. This is why it currently works only with Unix FTP
     servers (and the ones emulating Unix C<ls> output).

    +=item B<--active-ftp>
    +
    +Use the I<active> FTP retrieval scheme, in which the server
    +initiates the data connection. This is sometimes required to connect
    +to FTP servers that are behind firewalls.

     =item B<--passive-ftp>

     Use the I<passive> FTP retrieval scheme, in which the client
     initiates the data connection. This is sometimes required for FTP
    -to work behind firewalls.
    +to work behind firewalls, and as such is enabled by default.

     =item B<--retr-symlinks>

    Common subdirectories: src.orig/.libs and src/.libs
    Common subdirectories: src.orig/ChangeLog-branches and src/ChangeLog-branches
    diff -u src.orig/init.c src/init.c
    --- src.orig/init.c     Wed Jul 21 20:17:33 2004
    +++ src/init.c  Wed Jul 21 20:17:59 2004
    @@ -255,6 +255,7 @@
       opt.ftp_glob = 1;
       opt.htmlify = 1;
       opt.http_keep_alive = 1;
    +  opt.ftp_pasv = 1;
       opt.use_proxy = 1;
       tmp = getenv ("no_proxy");
       if (tmp)
    diff -u src.orig/main.c src/main.c
    --- src.orig/main.c     Wed Jul 21 20:17:33 2004
    +++ src/main.c  Wed Jul 21 20:17:59 2004
    @@ -217,7 +217,8 @@
     FTP options:\n\
       -nr, --dont-remove-listing   don\'t remove `.listing\' files.\n\
       -g,  --glob=on/off           turn file name globbing on or off.\n\
    -       --passive-ftp           use the \"passive\" transfer mode.\n\
    +       --passive-ftp           use the \"passive\" transfer mode (default).\n\
    +       --active-ftp            use the \"active\" transfer mode.\n\
            --retr-symlinks         when recursing, get linked-to files (not dirs).\n\
     \n"), stdout);
       fputs (_("\
    @@ -285,6 +286,7 @@
         { "no-parent", no_argument, NULL, 133 },
         { "non-verbose", no_argument, NULL, 146 },
         { "passive-ftp", no_argument, NULL, 139 },
    +    { "active-ftp", no_argument, NULL, 167 },
         { "page-requisites", no_argument, NULL, 'p' },
         { "quiet", no_argument, NULL, 'q' },
         { "random-wait", no_argument, NULL, 165 },
    @@ -397,6 +399,9 @@
         case 139:
           setval ("passiveftp", "on");
           break;
    +    case 167:
    +      setval ("passiveftp", "off");
    +      break;
         case 141:
           setval ("noclobber", "on");
           break;
wget bug: directory overwrite
Hello. Problem: when downloading everything under http://udn.epicgames.com/Technical/MyFirstHUD, wget overwrites the downloaded MyFirstHUD file with the MyFirstHUD directory (which comes later). GNU Wget 1.9.1, invoked as:

wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@

Solution: use of the -E option, as sketched below. Regards, Juhana
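A sketch of that workaround, assuming Wget 1.9.1's -E (--html-extension) flag: the page is then saved as MyFirstHUD.html, so the MyFirstHUD directory created later no longer collides with it:

wget -E -k --proxy=off -e robots=off -q -r -l 0 -np -U Mozilla \
  http://udn.epicgames.com/Technical/MyFirstHUD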
wget bug report
I sent this message to [EMAIL PROTECTED] as directed in the wget man page, but it bounced and said to try this email address. This bug report is for GNU Wget 1.8.2, tested on both RedHat Linux 7.3 and 9:

rpm -q wget
wget-1.8.2-9

When I use wget with -S to show the HTTP headers, and I use the --spider switch as well, it gives me a 501 error on some servers. The main example I have found was running it against a server running ntop (http://www.ntop.org/). You can find an RPM for it at http://rpm.pbone.net/index.php3/stat/4/idpl/586625/com/ntop-2.2-0.dag.rh90.i386.rpm.html and you can search with other parameters at rpm.pbone.net to get ntop for other versions of Linux. So here is the command and output:

wget -S --spider http://SERVER_WITH_NTOP:3000
HTTP request sent, awaiting response...
 1 HTTP/1.0 501 Not Implemented
 2 Date: Sat, 27 Mar 2004 07:08:24 GMT
 3 Cache-Control: no-cache
 4 Expires: 0
 5 Connection: close
 6 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 7 Content-Type: text/html
21:11:56 ERROR 501: Not Implemented.

I get a 501 error; echoing $? shows an exit status of 1. When I don't use the spider switch, I get the following:

wget -S http://SERVER_WITH_NTOP:3000
HTTP request sent, awaiting response...
 1 HTTP/1.0 200 OK
 2 Date: Sat, 27 Mar 2004 07:09:31 GMT
 3 Cache-Control: max-age=3600, must-revalidate, public
 4 Connection: close
 5 Server: ntop/2.2 (Dag Apt RPM Repository) (i686-pc-linux-gnu)
 6 Content-Type: text/html
 7 Last-Modified: Mon, 17 Mar 2003 20:27:49 GMT
 8 Accept-Ranges: bytes
 9 Content-Length: 1214
100%[==] 1,214 1.16M/s ETA 00:00
21:13:04 (1.16 MB/s) - `index.html' saved [1214/1214]

The exit status was 0 and the index.html file was downloaded. If this is a bug, please fix it in your next release of wget. If it is not a bug, I would appreciate a brief explanation as to why.

Thank You
Corey Henderson
Chief Programmer
GlobalHost.com
wget bug in retrieving large files 2 gig
Hi. While downloading a file of about 3,234,550,172 bytes with "wget http://foo/foo.mpg" I get an error:

HTTP request sent, awaiting response... 200 OK
Length: unspecified [video/mpeg]
[ <=> ] -1,060,417,124 13.10M/s
wget: retr.c:292: calc_rate: Assertion `bytes >= 0' failed.
Aborted

The md5sum of the downloaded file and the original file is the same, so there should not be an error. The amount of "bytes downloaded" shown during the transfer is not correct either: it becomes negative over 2 gig, which looks like the byte counter wrapping around in a signed 32-bit integer. Greetings from the Netherlands, Eduard
Re: wget bug with ftp/passive
don [EMAIL PROTECTED] writes:

I did not specify the passive option, yet it appears to have been used anyway. Here's a short transcript:

[EMAIL PROTECTED] sim390]$ wget ftp://musicm.mcgill.ca/sim390/sim390dm.zip
--21:05:21-- ftp://musicm.mcgill.ca/sim390/sim390dm.zip
           => `sim390dm.zip'
Resolving musicm.mcgill.ca... done.
Connecting to musicm.mcgill.ca[132.206.120.4]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.  ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /sim390 ... done.
==> PASV ... Cannot initiate PASV transfer.

Are you sure that something else hasn't done it for you? For example, a system-wide initialization file `/usr/local/etc/wgetrc' or `/etc/wgetrc'.
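A quick way to check, assuming the usual wgetrc locations:

grep -i '^ *passive' /etc/wgetrc /usr/local/etc/wgetrc ~/.wgetrc 2>/dev/null

Any "passive_ftp = on" line found there would explain the PASV attempt.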
Re: wget bug
Kairos [EMAIL PROTECTED] writes: $ cat wget.exe.stackdump [...] What were you doing with Wget when it crashed? Which version of Wget are you running? Was it compiled for Cygwin or natively for Windows?
wget bug
$ cat wget.exe.stackdump
Exception: STATUS_ACCESS_VIOLATION at eip=77F51BAA
eax= ebx= ecx=0700 edx=610CFE18 esi=610CFE08 edi=
ebp=0022F7C0 esp=0022F74C program=C:\nonspc\cygwin\bin\wget.exe
cs=001B ds=0023 es=0023 fs=0038 gs= ss=0023
Stack trace:
Frame     Function  Args
0022F7C0  77F51BAA  (000CFE08, 6107C8F1, 610CFE08, )
0022FBA8  77F7561D  (1004D9C0, , 0022FC18, 00423EF8)
0022FBB8  00424ED9  (1004D9C0, 0022FBF0, 0001, 0022FBF0)
0022FC18  00423EF8  (1004A340, 002A, 7865646E, 6D74682E)
0022FD38  0041583B  (1004A340, 0022FD7C, 0022FD80, 100662C8)
0022FD98  00420D93  (10066318, 0022FDEC, 0022FDF0, 100662C8)
0022FE18  0041EB7D  (10021A80, 0041E460, 610CFE40, 0041C2F4)
0022FEF0  0041C47B  (0004, 61600B64, 10020330, 0022FF24)
0022FF40  61005018  (610CFEE0, FFFE, 07E4, 610CFE04)
0022FF90  610052ED  (, , 0001, )
0022FFB0  00426D41  (0041B7D0, 037F0009, 0022FFF0, 77E814C7)
0022FFC0  0040103C  (0001, 001D, 7FFDF000, F6213CF0)
0022FFF0  77E814C7  (00401000, , 78746341, 0020)
End of stack trace
Wget Bug
Here is the debug output:

:/FTPD# wget ftp://ftp.dcn-asu.ru/pub/windows/update/winxp/xpsp2-1224.exe -d
DEBUG output created by Wget 1.8.1 on linux-gnu.
--13:25:55-- ftp://ftp.dcn-asu.ru/pub/windows/update/winxp/xpsp2-1224.exe
           => `xpsp2-1224.exe'
Resolving ftp.dcn-asu.ru... done.
Caching ftp.dcn-asu.ru => 212.192.20.40
Connecting to ftp.dcn-asu.ru[212.192.20.40]:21... connected.
Created socket 3.
Releasing 0x8073398 (new refcount 1).
Logging in as anonymous ... 220 news FTP server ready.
--> USER anonymous
331 Guest login ok, send your complete e-mail address as password.
--> PASS -wget@
530 Login incorrect.
Login incorrect.
Closing fd 3
Server reply is
530-
530-Sorry! Too many users are logged in.
530-Try later, please.
530-
530 Login incorrect.
Server reply matched ftp:retry-530, retrying

But wget won't even try to retry :( Can you fix that?
Re: Wget Bug
The problem is that the server replies with "Login incorrect", which normally means that authorization has failed and that further retries would be pointless. Short of having a natural-language parser built in, Wget cannot know that the authorization is in fact correct, but that the server happens to be busy. Maybe Wget should have an option to retry even in the case of (what looks like) a "login incorrect" FTP response.
Re: Wget Bug
Kempston [EMAIL PROTECTED] writes:

Yeah, I understand that, but lftp handles it fine even without specifying any additional option ;)

But then lftp is hammering servers when a real unauthorized entry occurs, no?

I'm sure you can work something out

Well, I'm satisfied with what Wget does now. :-)
Re: dificulty with Debian wget bug 137989 patch
jayme [EMAIL PROTECTED] writes: [...]

Before anything else, note that the patch originally written for 1.8.2 will need changes for 1.9. The changes are not hard to make, but they are still needed. The patch didn't make it into the canonical sources because it assumes `long long', which is not available on many platforms that Wget supports. The issue will likely be addressed in 1.10. Having said that:

I tried the patch from Debian bug report 137989 and it didn't work. Can anybody explain: 1 - why I have to make two directories for the patch to work: one wget-1.8.2.orig and one wget-1.8.2?

You don't. Just enter Wget's source directory and type `patch -p1 < patchfile'. `-p1' makes sure that the top-level directories, such as wget-1.8.2.orig and wget-1.8.2, are stripped when finding files to patch.

2 - why after compilation wget still can't download a file > 2GB?

I suspect you've tried to apply the patch to Wget 1.9-beta, which doesn't work, as explained above.
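Spelled out, assuming the patch has been saved as 137989.patch (a hypothetical file name) alongside the unpacked source:

tar xzf wget-1.8.2.tar.gz
cd wget-1.8.2
patch -p1 < ../137989.patch
./configure && make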
dificulty with Debian wget bug 137989 patch
I tried the patch from Debian bug report 137989 and it didn't work. Can anybody explain: 1 - why I have to make two directories for the patch to work: one wget-1.8.2.orig and one wget-1.8.2? 2 - why after compilation wget still can't download a file > 2GB? Note: I cut the patch for Debian use (the first diff). Thank you, Jayme [EMAIL PROTECTED]
wget bug
It's probably a bug: when downloading wget --mirror ftp://somehost.org/somepath/3acv14~anivcd.mpg, wget saves the file as-is, but when downloading wget ftp://somehost.org/somepath/3*, wget saves the files as 3acv14%7Eanivcd.mpg -- The human knowledge belongs to the world
Re: wget bug
Hi Jack :)

* Jack Pavlovsky [EMAIL PROTECTED] dixit: It's probably a bug: when downloading wget --mirror ftp://somehost.org/somepath/3acv14~anivcd.mpg, wget saves it as-is, but when downloading wget ftp://somehost.org/somepath/3*, wget saves the files as 3acv14%7Eanivcd.mpg

Yes, it *was* a bug. The latest prerelease has it fixed. I don't know if the tarball has the latest patches; ask Hrvoje. But if you are not in a hurry, just wait for 1.9 to be released.

The human knowledge belongs to the world

True ;))

Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.pleyades.net http://raul.pleyades.net/
Re: wget bug
Jack Pavlovsky [EMAIL PROTECTED] writes:

It's probably a bug: when downloading wget --mirror ftp://somehost.org/somepath/3acv14~anivcd.mpg, wget saves it as-is, but when downloading wget ftp://somehost.org/somepath/3*, wget saves the files as 3acv14%7Eanivcd.mpg

Thanks for the report. The problem here is that Wget tries to be helpful by encoding unsafe characters in file names as %XX, the way it is done in URLs. Your first example works because of an oversight (!) that actually made Wget behave as you expected. The good news is that this helpfulness has been rethought for the next release and is no longer there, at least not for ordinary characters like ~. Try getting the latest CVS sources; they should work better in this regard. (http://wget.sunsite.dk/ explains how to download the source from CVS.)
wget bug
Dear Sir: I tried to use "wget" to download data from an ftp site but got an error message, as follows:

> wget ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT

Screen shows:
--09:02:40-- ftp://ftp.ngdc.noaa.gov/pub/incoming/RGON/anc_1m.OCT
           => `anc_1m.OCT'
Resolving ftp.ngdc.noaa.gov... done.
Connecting to ftp.ngdc.noaa.gov[140.172.180.164]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.  ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/incoming/RGON ... done.
==> PORT ... done.  ==> RETR anc_1m.OCT ...
Error in server response, closing control connection. Retrying.

But when I use ftp (ftp ftp.ngdc.noaa.gov), I can get the data. My computer runs Linux, kernel 2.4.18-10smp #smp i686 unknown; the wget version is GNU wget 1.8.1. I have a script that uses "wget" to fetch data files automatically every month. When my computer ran Linux version 6.2, "wget" worked well; since I updated to version 7.4, "wget" fails as above. Thank you for your help.

Jing Ping Ye
Email: [EMAIL PROTECTED]
Phone: 303 497 3713
National Geophysical Data Center
CIRES, University of Colorado, Boulder, CO 80309
wget bug (?): --page-requisites should supercede robots.txt
Using wget 1.8.2:

$ wget --page-requisites http://news.com.com

...fails to retrieve most of the files that are required to properly render the HTML document, because they are forbidden by http://news.com.com/robots.txt. I think that use of --page-requisites implies that wget is being used as a "save this entire web page as..." utility for later human viewing, rather than as a text-indexing spider that wants to analyze the content but not the presentation. So I believe that wget should ignore robots.txt when --page-requisites is specified. If you agree then I'll try to write a patch and send it to you this week... please let me know if you agree or disagree. Thanks!

The gory bits: wget -d --page-requisites http://news.com.com says:

appending "http://news.com.com/i/hdrs/ne/y_fd.gif" to urlpos.

etc., but then later says:

Deciding whether to enqueue "http://news.com.com/i/hdrs/ne/y_fd.gif".
Rejecting path i/hdrs/ne/y_fd.gif because of rule `i/'.
Not following http://news.com.com/i/hdrs/ne/y_fd.gif because robots.txt forbids it.
Decided NOT to load it.
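Until such a patch lands, the usual workaround is the robots wgetrc variable (the same -e robots=off switch that appears in other reports in this archive):

wget -e robots=off --page-requisites http://news.com.com/

This disables the robots.txt check for the run, so the page requisites rejected above should be fetched.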
Wget Bug: Re: not downloading everything with --mirror
Funk Gabor wrote:

HTTP does not provide a dirlist command, so wget parses HTML to find other files it should download. Note: HTML, not XML. I suspect that is the problem.

If wget wouldn't download the rest, I'd say that too. But first the directory gets created, the XML is downloaded (and in some other directory some *.gif too), so wget senses the directory. If I issue wget -m site/dir then all of the rest comes down (index.html?D=A and others too), so wget is able to get everything, just not in one pass. So there is no technical limitation preventing wget from doing the mirror in one step. It is either a missing feature (shall I say a bug, since wget can't do the mirror it could have) or I was unable to find the switch which makes it happen at once.

Hmm, now I see. The vast majority of websites are configured to deny directory listings. That is probably why wget doesn't bother to try, except for the directory specified as the root of the download. I don't think there is any option to do this for all directories, because it's not really needed. The _real_ bug is that wget is failing to parse what look like valid <img ... src="..."> tags. Perhaps someone more familiar with wget's HTML parsing code could investigate? The command is:

wget -r -l0 www.jeannette.hu/saj.htm

and the ignored files are a number of image files.

Max.
Wget bug: 32 bit int for bytes downloaded.
It seems wget uses a 32-bit integer for the bytes downloaded:

[...]
FINISHED --17:11:26--
Downloaded: 1,047,520,341 bytes in 5830 files

cave /home/suse8.0# du -s
5230588 .
cave /home/suse8.0#

du reports 5,230,588 KB (about 5.3 GB), so the printed total appears to have wrapped around 2^32 (1,047,520,341 + 4,294,967,296 is roughly 5.3 GB). As it's a once-per-download variable, I'd say it's not that performance critical... Roger.
WGET BUG
Hi, I have a problem and would really like you to help me. I'm using wget to download a list of file URLs via an HTTP proxy. When the proxy server goes offline, wget doesn't retry downloading the files. Can you fix that, or can you tell me how I can fix it?
WGET BUG
Like that:

Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23-- ftp://kempston:*password*@194.151.106.227:15003/Dragon
           => `dragon.001'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23-- ftp://kempston:*password*@194.151.106.227:15003/Dragon
           => `dragon.002'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23-- ftp://kempston:*password*@194.151.106.227:15003/Dragon
           => `dragon.003'
Connecting to 195.108.41.140:3128... failed: Connection refused.
--01:19:23-- ftp://kempston:*password*@194.151.106.227:15003/Dragon
           => `dragon.004'
Connecting to 195.108.41.140:3128... failed: Connection refused.
FINISHED --01:19:23--
Downloaded: 150,000,000 bytes in 10 files

- Original Message - From: Kempston To: [EMAIL PROTECTED] Sent: Monday, July 08, 2002 12:50 AM Subject: WGET BUG
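A crude client-side workaround, assuming a POSIX shell and a urls.txt file (hypothetical name) holding the URL list: keep re-running wget until it exits successfully, resuming partial files with -c:

until wget -c -i urls.txt; do
    echo "download failed, retrying in 60s..." >&2
    sleep 60
done

-c continues partially downloaded files and -i reads the URLs from the file, so each pass picks up where the last one stopped.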
Re: wget bug (overflow)
I'm afraid that downloading files larger than 2G is not supported by Wget at the moment.
wget bug (overflow)
fbsd1 --- http/wget, eshop.tar (3.3G) ---> fbsd2. The command was:

# wget http://kamenica/eshop.tar

At the second gigabyte I got the following:

2097050K .. .. .. .. .. 431.03 KB/s
2097100K .. .. .. .. .. 8.14 MB/s
2097150K .. .. .. .. .. 3.76 MB/s
-2097104K .. .. .. .. .. 12.21 MB/s
-2097054K .. .. .. .. .. 8.14 MB/s
...

So I did nothing, seeing that everything continued normally. But at the end I got:

-684104K .. .. .. .. .. 1.74 MB/s
-684054K 0.00 B/s
assertion "bytes >= 0" failed: file retr.c, line 254
Abort trap (core dumped)

# wget -V
GNU Wget 1.8.1
# uname -a
FreeBSD vihren.etrade.xx 4.5-STABLE FreeBSD 4.5-STABLE #0: Sat Feb 23 16:54:34 EET 2002 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/VIHREN i386

I'm not sending you the wget.core because the problem is obvious, in my view. I can repeat the exercise if you want that wget.core or some more debugging info. The file is 3,594,496,000 bytes and was copied successfully:

kamenica:~# md5 eshop.tar
MD5 (eshop.tar) = f1709dcad40073b8c8624a8e100d7697
vihren:~# md5 eshop.tar
MD5 (eshop.tar) = f1709dcad40073b8c8624a8e100d7697
Re: wget bug?!
On Monday 18 February 2002 17:52, you wrote: [...]

That would be great. The problem is that I'm using it to retrieve files mostly from servers that have too many users. No, I don't want to hammer the server, but I do want to keep on trying at reasonable intervals until I get the file. I think the feature would be usable in other scenarios as well. You now have --waitretry and --wait; in my personal opinion the best would perhaps be to add --waitint(er)(val), or perhaps just --int(er)(val). Anyways, thanks for the reply. Kind regards, Ferry van Steen
Re: wget bug?!
[The message I'm replying to was sent to [EMAIL PROTECTED]. I'm continuing the thread on [EMAIL PROTECTED], as there is no bug and I'm turning it into a discussion about features.]

On 18 Feb 2002 at 15:14, TD - Sales International Holland B.V. wrote:

I've tried -w 30, --waitretry=30, --wait=30 (I think this last one is for multiple files and the time between them, though). None of these seem to make wget want to wait 30 secs before trying again. Like this I'm hammering the server.

The --waitretry option will wait 1 second for the first retry, then 2 seconds, 3 seconds, etc., up to the value specified. So you may consider the first few retry attempts to be hammering the server, but it will gradually back off. It sounds like you want an option to specify the initial retry interval (currently fixed at 1 second), but Wget currently has no such option, nor an option to change the amount it increments by for each retry attempt (also currently fixed at 1 second). If such features were to be added, perhaps they could work something like this:

--waitretry=n     - same as --waitretry=n,1,1
--waitretry=n,m   - same as --waitretry=n,m,1
--waitretry=n,m,i - wait m seconds for the first retry, incrementing by i seconds for subsequent retries, up to a maximum of n seconds

The disadvantage of doing it that way is that no one will remember in which order the numbers should appear, so an alternative is to leave --waitretry alone and supplement it with --waitretryfirst and --waitretryincr options.
Re: [Wget]: Bug submission
[ Please mail bug reports to [EMAIL PROTECTED], not to me directly. ]

Nuno Ponte [EMAIL PROTECTED] writes:

I get a segmentation fault when invoking:

wget -r http://java.sun.com/docs/books/performance/1st_edition/html/JPTOC.fm.html

My Wget version is 1.7-3, the one which is bundled with RedHat 7.2. I attached my .wgetrc.

Wget 1.7 is fairly old -- it was followed by a bugfix 1.7.1 release, and then by 1.8 and 1.8.1. Please try upgrading to the latest version, 1.8.1, and see if the bug repeats. I couldn't repeat it with 1.8.1.
wget bug
Hi, when I try to send a page to a Nextel mobile using the following command from a Unix box:

wget "http://www.nextel.com/cgi-bin/sendPage.cgi?to01=4157160856%26message=hellothere%26action=send"

wget returns the following output, but the page never reaches the phone:

--15:59:16-- http://www.nextel.com:80/cgi-bin/sendPage.cgi?to01=4157160856&message=hellothere&action=send
           => `sendPage.cgi?to01=4157160856&message=hellothere&action=send'
Location: http://messaging.nextel.com/cgi/mPageExt.dll?buildIndAddressPage&entry=1 [following]
--15:59:16-- http://messaging.nextel.com:80/cgi/mPageExt.dll?buildIndAddressPage&entry=1
           => `mPageExt.dll?buildIndAddressPage&entry=1.14'
Length: unspecified [text/html]
15:59:16 (75.02 KB/s) - `mPageExt.dll?buildIndAddressPage&entry=1.14' saved [9986]

But when I send a page from the Nextel.com web site, it reaches my cell phone. I thought you would be able to help me out; your valuable help would be highly appreciated. Thanks, Muthu
wget bug
Dear sir. When I put the line

http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A%2F%2F194.67.26.82&word=FreeBSD

into my browser (NN3), it works correctly. When I give this line to wget, wget changes it: the hidden argument becomes http://194.67.26.82word and the word argument is empty. Where am I wrong?
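One thing worth ruling out first, assuming a Bourne-style shell: if the URL is not quoted on the command line, the shell itself treats & as a command separator, so everything after it never reaches wget at all. Quoting the whole URL avoids that:

wget "http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A%2F%2F194.67.26.82&word=FreeBSD"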
Re: maybe wget bug
Hack Kampbjørn [EMAIL PROTECTED] writes:

You have hit one of Wget's features: it is overzealous in converting URLs into canonical form. As you have discovered, Wget first converts all encoded characters back to their real values and then encodes those that are unsafe for sending in URLs.

It's a bug. The correct solution has been proposed by Anon Sricharoenchai, and I've implemented the function, but it will take some time to integrate it into Wget.
maybe wget bug
Hello, I am using wget to invoke a CGI script, passing it several variables. For example:

wget -O myfile.txt "http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue&SHAPE=circle"

where myscript.cgi, say, makes an image based on the parameters "COLOR" and "SHAPE". The problem I am having is when I need to pass a key/value pair where the value contains the "&" character. Such as:

wget -O myfile.txt "http://user:[EMAIL PROTECTED]/myscript.cgi?COLOR=blue red&SHAPE=circle"

I have tried encoding the "&" as %26, but that does not seem to work (spaces as %20 work fine). The error log for the web server shows that the URL requested does not say %26, but rather "&". It does not appear to me that wget is sending the %26 as %26; perhaps it is "fixing" it to "&". I am using GNU wget v1.5.3 with Red Hat 7.0. Thanks! -- David Christopher Asher
wget bug - after closing control connection
Hello, I've found a (less important) bug in wget. I was downloading a file from an FTP server and the control connection of the FTP service was closed by the server. After that, wget started to print incorrect progress information (beyond 100%). The log follows:

# wget -nd ftp://ftp.suse.com/pub/suse/i386/update/7.0/n1/mod_php.rpm
--12:30:48-- ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm
           => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.  ==> RETR mod_php.rpm ... done.
Length: 1,599,213 (unauthoritative)

   0K -> .. .. .. .. .. [  3%]
  50K -> .. .. .. .. .. [  6%]
 100K -> .. .. .. .. .. [  9%]
 150K -> .. .. .. .. .. [ 12%]
 200K -> .. .. .. .. .. [ 16%]
 250K -> .. .. .. .. .. [ 19%]
 300K -> .. .. .. .. .. [ 22%]
 350K -> .. .. .. .. .. [ 25%]
 400K -> .. .. .. .. .. [ 28%]
 450K -> .. .. .. .. .. [ 32%]
 500K -> .. .. .. .. .. [ 35%]
 550K -> .. .. .        [ 36%]

12:41:36 (916.90 B/s) - Control connection closed. Retrying.

--12:50:38-- ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm (try: 3)
           => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.  ==> REST 626688 ... done.
==> RETR mod_php.rpm ... done.
Length: 972,525 [345,837 to go] (unauthoritative)

[ skipping 600K ]
 600K -> ,, ,, .. .. .. [ 68%]
 650K -> .. .. .. .. .. [ 72%]

12:57:59 (187.36 B/s) - Control connection closed. Retrying.

--12:57:59-- ftp://ftp.suse.com:21/pub/suse/i386/update/7.0/n1/mod_php.rpm (try: 4)
           => `mod_php.rpm'
Connecting to ftp.suse.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done.  ==> CWD pub/suse/i386/update/7.0/n1 ... done.
==> PORT ... done.  ==> REST 708608 ... done.
==> RETR mod_php.rpm ... done.
Length: 890,605 [181,997 to go] (unauthoritative)

[ skipping 650K ]
 650K -> ,, ,, ,, ,, ,, [ 80%]
 700K -> .. .. .. .. .. [ 86%]
 750K -> .. .. .. .. .. [ 91%]
 800K -> .. .. .. .. .. [ 97%]
 850K -> .. .. .. .. .. [103%]
 900K -> .. .. .. .. .. [109%]
 950K -> .. .. .. .. .. [114%]
1000K -> .. .. .. .. .. [120%]
1050K -> .. .. .. .. .. [126%]
1100K -> .. .. .. .. .. [132%]
1150K -> .. .. .. .. .. [137%]

Cezary Sobaniec
Institute of Computing Science
Poznan University of Technology
[EMAIL PROTECTED]
tel. (+48 61) 665-28-09
Re: wget bug - after closing control connection
Which version of wget do you use? Are you aware that wget 1.6 has been released and that 1.7 is in development (and that they contain a workaround for the "Lying FTP server syndrome" you are seeing)?

Csaba Ráduly, Software Engineer
Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9
UK support: +44 1235 559933