Re: Help Newbie: Files required are not downloaded.

2006-03-21 Thread Mauro Tortonesi

Jim Wright wrote:

Wildcards don't work is the accepted wisdom.  I just realized that I
have been using downloads of the form --accept AB06*a.T00,AB06*a.BNX
for a long time and it works fine for me.  Should it not?


of course! i can't believe i wrote something so stupid. i was working on 
regex support, which will be included in the next wget release, and for 
some reason (my brain was probably unplugged) i confused regexes with 
wildcards. i am sorry.


 Looking at the lines below, the reject encompasses all of the accept,
 so if reject is applied after accept, this may also explain the
 problem.

you're right, of course.
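
Jim's point about the overlap can be checked with a small Python sketch. It uses Python's fnmatch module as a stand-in for shell-style matching, and it assumes that a file is kept only when it matches an accept pattern and no reject pattern. The file names are made up for illustration, and this is not Wget's actual code.

```python
from fnmatch import fnmatchcase

# Patterns taken from the wgetrc quoted in this thread.
ACCEPT = ["*06*BATCH*VAT.CGL"]
REJECT = ["*.CGL", "*.CHK"]

# Hypothetical file names, for illustration only.
names = ["A06_BATCH_VAT.CGL", "A06_BATCH.CGL", "A06_BATCH.CHK"]


def matches_any(name, patterns):
    """Return True if the name matches at least one shell-style pattern."""
    return any(fnmatchcase(name, p) for p in patterns)


def keep(name):
    # Assumed rule: keep a file only if it is accepted AND not rejected.
    return matches_any(name, ACCEPT) and not matches_any(name, REJECT)


kept = [n for n in names if keep(n)]
print(kept)
```

Under that assumed rule nothing survives: every name that the accept pattern matches ends in .CGL, so the reject pattern *.CGL matches it as well.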

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                          http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: Help Newbie: Files required are not downloaded.

2006-03-21 Thread TeeJay
Mauro Tortonesi mauro.tortonesi at unife.it writes:

 
 Jim Wright wrote:
  Wildcards don't work is the accepted wisdom.  I just realized that I
  have been using downloads of the form --accept AB06*a.T00,AB06*a.BNX
  for a long time and it works fine for me.  Should it not?
 
 of course! i can't believe i wrote something so stupid. i was working on 
 regex support, which will be included in the next wget release, and for 
 some reason (my brain was probably unplugged) i confused regexes with 
 wildcards. i am sorry.
 
 you're right, of course.

==
Thank you very much. I've got the test version working now. :-)
Right, now I just have to migrate it into the live environment...
I may be back but in case not, I appreciate the help.

Regards,
TeeJay

==




Re: Help Newbie: Files required are not downloaded.

2006-03-20 Thread Mauro Tortonesi

TeeJay wrote:


Help for a newbie please.

I have created my wgetrc file. It contains the following variables:

input = C:\WGET\source.txt
user =  (crossed out for security)
password =  (crossed out for security)
check_certificate = off
recursive = on
reclevel = 2
no_parent = off
dirstruct = off
html_extension = off
accept = *06*BATCH*VAT.CGL
reject = *.CGL,*.CHK
quiet = on
server_response = on
logfile = C:\WGET\Wgetlog.log

When I run in debug mode (wget -d), my log file shows the directory listing of 
all files from the site specified.


Yet after deciding whether to enqueue the files specified by the accept 
variable, the result is always "Decided NOT to load it".


Have I done something wrong with the wgetrc file or its variables?


you are using wildcards to specify which files to accept or reject, and 
wget does not support them.




Re: Help Newbie: Files required are not downloaded.

2006-03-20 Thread TeeJay
Mauro Tortonesi mauro.tortonesi at unife.it writes:

you are using wildcards to specify which files to accept or reject, and wget 
does not support them.

-
Thanks for the quick response Mauro!  
I am surprised at the response though because from the wget.html doc I read 
the following:

[Quote]
accept = acclist 
The argument to --accept option is a list of file suffixes or patterns that 
Wget will download during recursive retrieval. A suffix is the ending part of 
a file, and consists of normal letters, e.g. gif or .jpg. A matching pattern 
contains shell-like wildcards, e.g. books* or zelazny*196[0-9]*. 
So, specifying wget -A gif,jpg will make Wget download only the files ending 
with gif or jpg, i.e. GIFs and JPEGs. On the other hand, 
wget -A zelazny*196[0-9]* will download only files beginning with zelazny and 
containing numbers from 1960 to 1969 anywhere within. Look up the manual of 
your shell for a description of how pattern matching works. 

Of course, any number of suffixes and patterns can be combined into a 
comma-separated list, and given as an argument to -A.
[EndQuote] 
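
To make the quoted description concrete, here is a minimal Python sketch of how such a list could be interpreted. It assumes that an entry containing a wildcard character is treated as a shell-style pattern and that any other entry is treated as a plain suffix. The helper names and the sample file names are hypothetical, and this is an illustration of the documented behaviour rather than Wget's real implementation.

```python
from fnmatch import fnmatchcase

# Characters that mark an entry as a shell-like pattern (an assumption
# based on the documentation quoted above).
WILDCARD_CHARS = set("*?[]")


def entry_matches(filename, entry):
    """Match one list entry: pattern if it has wildcards, suffix otherwise."""
    if any(ch in WILDCARD_CHARS for ch in entry):
        return fnmatchcase(filename, entry)
    return filename.endswith(entry)


def accepted(filename, acclist):
    """Return True if the filename matches any entry of a comma-separated list."""
    return any(entry_matches(filename, e) for e in acclist.split(","))


print(accepted("photo.gif", "gif,jpg"))
print(accepted("zelazny-1965-lord.txt", "zelazny*196[0-9]*"))
print(accepted("zelazny-1975.txt", "zelazny*196[0-9]*"))
```

The first two calls report a match and the third does not, since 1975 falls outside the 1960 to 1969 range the pattern describes.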

Has this ability to specify patterns changed? Is there any other way I can 
specify the types of files I want, i.e. only those with file names matching 
the accepted pattern?

BTW I forgot to mention that I'm using Wget Ver 1.10.2

Thanks in advance.

TeeJay



Re: Help Newbie: Files required are not downloaded.

2006-03-20 Thread TeeJay
Jim Wright jwright at unavco.org writes:

 
 Wildcards don't work is the accepted wisdom.  I just realized that I
 have been using downloads of the form --accept AB06*a.T00,AB06*a.BNX
 for a long time and it works fine for me.  Should it not?
 
 Looking at the lines below, the reject encompasses all of the accept,
 so if reject is applied after accept, this may also explain the problem.
 
 Jim
 
 On Mon, 20 Mar 2006, Mauro Tortonesi wrote:
 
   accept = *06*BATCH*VAT.CGL
   reject = *.CGL,*.CHK
  
  you are using wildcards to specify which files to accept or reject, and wget
  does not support them.
 
--
Good spot there Jim! My thanks. :-))
OK.
Changed reject variable to remove *.CGL and modified *.CHK to simply CHK.
Also amended the accept variable to simply CGL - this is not really what I 
want but I'm willing to try anything to first get a hit before I try to 
whittle it down further.

Happy to say I'm now getting a hit - well actually tons of them!! as the site 
contains files with names *Batch.CGL as well as *BatchVAT.CGL.

It is only the latter files that I want.

Any helpful hints? Or am I going to have to resort to writing a batch file to 
sift the data?


As always, thanks for your knowledgeable inputs.

TeeJay.



Re: Help Newbie: Files required are not downloaded.

2006-03-20 Thread Jim Wright
I'd suggest using your original accept, and not using a reject.
You know specifically what you want, and all the rest will be ignored.

Jim
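
As a rough check of this suggestion, the Python sketch below applies only the original accept pattern to a few made-up file names, again using Python's fnmatch module as a stand-in for shell-style matching. The names are hypothetical. The sketch matches case-sensitively, which is an assumption worth verifying against your Wget version.

```python
from fnmatch import fnmatchcase

# Original accept pattern from the wgetrc; no reject list is used.
ACCEPT = "*06*BATCH*VAT.CGL"

# Hypothetical file names, for illustration only.
names = [
    "X06_BATCH_VAT.CGL",  # wanted: contains BATCH followed by VAT.CGL
    "X06_BATCH.CGL",      # not wanted: no VAT before the extension
    "X06_BATCH.CHK",      # not wanted: wrong extension
    "X06_BatchVAT.CGL",   # mixed case: fails a case-sensitive match
]

selected = [n for n in names if fnmatchcase(n, ACCEPT)]
print(selected)
```

Only the first name is selected. The last entry shows why the letter case in the pattern matters: if the real files are named *BatchVAT.CGL, a pattern written with BATCH would miss them in a case-sensitive comparison.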

