Re: Updates on Wget Future Directions
Micah, But most importantly, what will the new name of Wget be?
Re: wget running in windows Vista
There are annoying issues with UAC and virtually every piece of software, but the problems are not insurmountable. Liz: I'll give this a try this weekend with the latest version of wget. I need to download it, anyway. - Original Message - From: Christopher G. Lewis [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED]; Liz Labbe [EMAIL PROTECTED]; wget@sunsite.dk Sent: Friday, February 01, 2008 3:13 PM Subject: RE: wget running in windows Vista OK - ignore what I said - I've been immersed in Vista security these days, so I thought there might be some issues with wget and UAC. Chris Christopher G. Lewis http://www.ChristopherLewis.com -Original Message- From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Thursday, January 31, 2008 4:06 PM To: Liz Labbe; wget@sunsite.dk Subject: Re: wget running in windows Vista What version of wget? What edition of Vista? I have used wget 1.10.2 on Vista before. Alan - Original Message - From: Christopher G. Lewis [EMAIL PROTECTED] To: Liz Labbe [EMAIL PROTECTED]; wget@sunsite.dk Sent: Thursday, January 31, 2008 8:22 AM Subject: RE: wget running in windows Vista On Vista, you probably have to run in an administrative command prompt. Christopher G. Lewis http://www.ChristopherLewis.com -Original Message- From: Liz Labbe [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 30, 2008 10:57 PM To: wget@sunsite.dk Subject: wget running in windows Vista I just downloaded wget and am trying to get it to work under the Windows Vista operating system. I cannot get it to connect: for example, I tried wget http://www.yahoo.com/ and wget force_html=on http://www.yahoo.com/ etc. I consistently get the message "Connecting to www.yahoo.com|99.99.99.99|... Failed... network is down." Is there an issue with Vista, or am I doing something wrong? Thanks, Liz
Re: wget running in windows Vista
What version of wget? What edition of Vista? I have used wget 1.10.2 on Vista before. Alan - Original Message - From: Christopher G. Lewis [EMAIL PROTECTED] To: Liz Labbe [EMAIL PROTECTED]; wget@sunsite.dk Sent: Thursday, January 31, 2008 8:22 AM Subject: RE: wget running in windows Vista On Vista, you probably have to run in an administrative command prompt. Christopher G. Lewis http://www.ChristopherLewis.com -Original Message- From: Liz Labbe [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 30, 2008 10:57 PM To: wget@sunsite.dk Subject: wget running in windows Vista I just downloaded wget and am trying to get it to work under the Windows Vista operating system. I cannot get it to connect: for example, I tried wget http://www.yahoo.com/ and wget force_html=on http://www.yahoo.com/ etc. I consistently get the message "Connecting to www.yahoo.com|99.99.99.99|... Failed... network is down." Is there an issue with Vista, or am I doing something wrong? Thanks, Liz
wget2
What is wget2? Any plans to move to Java? (Of course, the latter will not be controversial. :) Alan
Re: Wget Name Suggestions
How about: -- JustGetIt? --GoFetch? --SuckItUp? -- Hoover? -- DragNet? -- NetSucker?
Re: wget2
Sorry for the misunderstanding. Honestly, Java would be a great language for what wget does. Lots of built-in support for web stuff. However, I was kidding about that. wget has a ton of great functionality, and I am a reformed C/C++ programmer (or a recent Java convert). But I love using wget! Alan
Re: Need help with wget from a password-protected URL
Quotation marks around the text containing special characters should work in Windows batch files. - Original Message - From: Tony Godshall [EMAIL PROTECTED] To: Uma Shankar [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Sunday, November 11, 2007 12:48 AM Subject: Re: Need help with wget from a password-protected URL Sounds like a shell issue. Assuming you are on a *nix, try quoting the password so the shell passes the weird chars literally. If you are on Windows, it's another story. On 11/10/07, Uma Shankar [EMAIL PROTECTED] wrote: Hi - I've been struggling to download data from a protected site. The man pages instruct me to use the --http-user=USER and --http-passwd=PASS options when issuing the wget command to the URL. I get error messages when wget encounters special chars in the password. Is there a way to get around this? I really need help downloading the data. Thanks, Uma Shankar, Research Associate Institute for the Environment Bank of America Plaza CB# 6116 137 E. Franklin St Room 644 Chapel Hill NC 27599-6116 Phone: (919) 966-2102 Fax (919) 843-3113 Mobile: (919) 441-9202 Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information? -T. S. Eliot (1888-1965) -- Best Regards. Please keep in touch.
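A minimal sketch of the quoting fix on a *nix shell, along the lines Tony suggests. The user name, password, and URL below are made up for illustration; the printf stands in for the actual wget invocation so the quoting itself can be checked:

```shell
# Hypothetical credentials; the password contains shell metacharacters (& and !).
USER='uma'
PASS='pa&ss!word'

# Single quotes at assignment, and double quotes around the expansions,
# keep the shell from interpreting & and ! before wget ever sees them.
# The real command would be:
#   wget --http-user="$USER" --http-passwd="$PASS" 'https://example.com/protected/data'
printf 'wget --http-user=%s --http-passwd=%s\n' "$USER" "$PASS"
```

Without the quotes, the shell would split the command at `&` and run half of it in the background, which is consistent with the errors described above.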
Re: Software interface to wget
Micah, Yes, I was thinking of a library, not realizing how difficult that would be, as I have never looked at the wget source code. Also, I am new to Java. I know that there is a lot of built-in support for HTTP etc., but I've only used a few things. After looking at a couple of HTML and XHTML files, I think my needs might be met if I download them and make a few substitutions (hrefs, img src's, etc.) for absolute or local file-based references. I wanted to do more than single downloads, so that is why the -O option will not suffice. Regardless, wget has a lot of nice features and plans for good improvements. While it won't yet meet this one need, I will certainly continue to use it for other purposes. Thanks, Alan - Original Message - From: Micah Cowan [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Thursday, October 04, 2007 12:55 PM Subject: Re: Software interface to wget Alan Thomas wrote: Idea for future wget versions: It would be nice if I could invoke wget programmatically and have options like returning data in buffers versus files (so data can be searched and/or manipulated in memory). This can already be done by using wget's -O switch, which directs the output to a specified file (including standard output). A wrapper program could simply read wget's stdout directly into a buffer. However, -O is only really useful for single downloads, as there is no delineation between separate files. And, I'll admit that I'm not clear how easy this is to do with 100% Pure Java; it's quite straightforward on Unix systems in most languages. Then it could be more easily and seamlessly integrated into other software that needs this capability. I would especially like to be able to invoke wget from Java code. It sounds to me like you're asking for a library version of Wget. 
There aren't specific plans to support this at the moment, and I'm not sure how much it'd really buy you: high-level programming languages such as Java, Python, Perl, etc., tend to ship with good HTTP and HTML-parsing libraries, in which case rigging your own code to do a good chunk of what Wget does is probably less work than trying to adapt Wget into library form. I'm not saying I'm ruling it out, but I'd need to hear some good cases for it, in contrast to using what's already available on those platforms. However, some changes are in the works (early, early planning stages) for Wget to sport a plugin architecture, and if a bit of glue to call out to higher-level languages is added, plugins written in languages such as Java wouldn't be a big stretch. It may well be that restructuring Wget as a library instead of as a standalone app that runs plugins is a better solution; it bears discussion. Also planned is a more flexible output system, allowing for arbitrary formatting of downloaded resources (such as .mht's, or tarballs, or whatever), making delineation in a single output stream possible; also, a metadata system for preserving information about what files have been completely downloaded and which were interrupted, what their original URLs were, etc. All of this, however, is a long way from even really being started, especially given our current developer resources. - -- HTH, Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer... http://micah.cowan.name/
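The delineation problem Micah describes with -O can be seen without any network access. In this sketch, two local files stand in for two downloaded pages (the file names are made up):

```shell
# Two local files stand in for two downloaded pages.
printf 'page one' > a.html
printf 'page two' > b.html

# `wget -O combined url1 url2` effectively concatenates both bodies
# into one stream, just like this:
cat a.html b.html > combined

# 16 bytes total, but nothing in the stream marks where a.html ends
# and b.html begins -- which is why -O only suits single downloads.
wc -c < combined
```

This is exactly the gap the planned "flexible output system" (.mht, tarballs, etc.) would close, since those container formats carry per-file boundaries and metadata.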
Re: wget syntax problem ?
command.com. By the way, your and Josh's messages are being put out to the list in duplicates (at least, that's what I'm seeing on my end). - Original Message - From: Micah Cowan [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Thursday, September 06, 2007 9:34 PM Subject: Re: wget syntax problem ? Alan Thomas wrote: Please ignore. It was needing the \\, like Josh said. Out of curiosity, what command interpreter were you using? Was this command.com, or something else like rxvt/Cygwin? - -- Micah J. Cowan Programmer, musician, typesetting enthusiast, gamer... http://micah.cowan.name/
Re: wget syntax problem ?
Please ignore. It needed the \\, like Josh said. - Original Message - From: Alan Thomas [EMAIL PROTECTED] To: Josh Williams [EMAIL PROTECTED]; wget@sunsite.dk Sent: Thursday, September 06, 2007 9:25 PM Subject: Re: wget syntax problem ? Wget does not like my use of the --directory-prefix= option. Anyone know why? - Original Message - From: Josh Williams [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Thursday, September 06, 2007 8:53 PM Subject: Re: wget syntax problem ? On 9/6/07, Alan Thomas [EMAIL PROTECTED] wrote: I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command: cd .. wget --convert-links --directory-prefix=C:\WINDOWS\Profiles\Alan000\Desktop\wget\CNN\ --no-clobber http://www.cnn.com Don't use backslashes in filenames. If you do, use `\\` instead.
Re: wget syntax problem ?
Wget does not like my use of the --directory-prefix= option. Anyone know why? - Original Message - From: Josh Williams [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Thursday, September 06, 2007 8:53 PM Subject: Re: wget syntax problem ? On 9/6/07, Alan Thomas [EMAIL PROTECTED] wrote: I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command: cd .. wget --convert-links --directory-prefix=C:\WINDOWS\Profiles\Alan000\Desktop\wget\CNN\ --no-clobber http://www.cnn.com Don't use backslashes in filenames. If you do, use `\\` instead.
Re: Cannot write to auto-generated file name
This seems to work up to and including 259 characters in the filename (not counting the file extension) on Windows (98). Alan - Original Message - From: Alan Thomas [EMAIL PROTECTED] To: wget@sunsite.dk Sent: Monday, April 02, 2007 10:19 PM Subject: Cannot write to auto-generated file name Is this an operating system issue or a wget issue? I am using wget version 1.10.2b on Windows 98 and XP. I had this problem on both operating systems. I downloaded the binaries from http://www.christopherlewis.com/. I created and executed the following batch file containing a wget command: wget --convert-links --directory-prefix=C:\Program Files\wget\test --no-clobber --output-file=no_work_logfile.txt http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/01/01/2000->12/31/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware)+or+spec/(software+and+hardware))&d=PTXT In this case, it does not like the automatically-generated filename. The following is the resulting logfile: --21:50:18-- http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/01/01/2000-%3E12/31/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware)+or+spec/(software+and+hardware))&d=PTXT => `C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware)+or+spec%2F(software+and+hardware))&d=PTXT' Resolving patft.uspto.gov... 151.207.240.33, 151.207.240.26, 151.207.240.23 Connecting to patft.uspto.gov|151.207.240.33|:80... connected. HTTP request sent, awaiting response... 
200 Script results follow Length: unspecified [text/html] C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware)+or+spec%2F(software+and+hardware))&d=PTXT: No such file or directory Cannot write to `C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware)+or+spec%2F(software+and+hardware))&d=PTXT' (No such file or directory). Converting C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware)+or+spec%2F(software+and+hardware))&d=PTXT... nothing to do. Converted 1 files in 0.000 seconds. However, when I eliminate the third field (spec) in the request, which shortens the length of the filename, like this: wget --convert-links --directory-prefix=C:\Program Files\wget\test --no-clobber --output-file=works_logfile.txt http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/01/01/2000->12/31/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT it works fine. 
Its logfile is: --21:50:06-- http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/01/01/2000-%3E12/31/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT => `C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware))&d=PTXT' Resolving patft.uspto.gov... 151.207.240.23, 151.207.240.33, 151.207.240.26 Connecting to patft.uspto.gov|151.207.240.23|:80... connected. HTTP request sent, awaiting response... 200 Script results follow Length: unspecified [text/html] 0K .. .. .. .. .. 63.36 KB/s 50K .. 123.64 KB/s 21:50:12 (71.20 KB/s) - `C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm%2F(software+and+hardware)+or+ttl%2F(software+and+hardware))&d=PTXT' saved [66128] Converting C:/Program Files/wget/test/[EMAIL PROTECTED]Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=0&p=1&f=S&l=50&Query=isd%2F01%2F01%2F2000-%3E12%2F31%2F2010+and+(aclm
Special characters in http
I am using wget 1.10.2 on a Windows 98 machine. I would like to non-interactively query the U.S. patent database. I am using the following wget command: wget --convert-links --directory-prefix=C:\Program Files\wget\perimeter --no-clobber http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/1/1/2000->1/1/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT However, this query is seen by the server as: isd/1/1/2000-/1/2010 and (aclm/(software and hardware) or ttl/(software and hardware)) So, the > character is being mistranslated somewhere along the way, which the server does not like (no matches returned). However, if I open the link below directly in my browser, it works fine: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/1/1/2000->1/1/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT What is happening? Is the problem due to the fact that the > character is reserved in HTML? Is there something that I should do differently? I am still a novice to wget and HTTP. I have looked in the wget manual (probably not strictly a wget question) and on the web, but have not found where this is discussed. Thanks, Alan
Re: Special characters in http
Substituting %3E for > does not work either. - Original Message - From: Alan Thomas To: wget@sunsite.dk Sent: Saturday, March 31, 2007 3:23 AM Subject: Special characters in http I am using wget 1.10.2 on a Windows 98 machine. I would like to non-interactively query the U.S. patent database. I am using the following wget command: wget --convert-links --directory-prefix=C:\Program Files\wget\perimeter --no-clobber http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/1/1/2000->1/1/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT However, this query is seen by the server as: isd/1/1/2000-/1/2010 and (aclm/(software and hardware) or ttl/(software and hardware)) So, the > character is being mistranslated somewhere along the way, which the server does not like (no matches returned). However, if I open the link below directly in my browser, it works fine: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=isd/1/1/2000->1/1/2010+and+(aclm/(software+and+hardware)+or+ttl/(software+and+hardware))&d=PTXT What is happening? Is the problem due to the fact that the > character is reserved in HTML? Is there something that I should do differently? I am still a novice to wget and HTTP. I have looked in the wget manual (probably not strictly a wget question) and on the web, but have not found where this is discussed. Thanks, Alan
Re: Special characters in http
Steven, Thanks! Putting quotes around the http request worked. Alan - Original Message - From: Steven M. Schweda [EMAIL PROTECTED] To: WGET@sunsite.dk Cc: [EMAIL PROTECTED] Sent: Saturday, March 31, 2007 8:27 AM Subject: Re: Special characters in http From: Alan Thomas What is happening? [...] I'm no Windows expert, but, as you said, these are special characters. Have you tried quoting the URL? In UNIX, apostrophes and quotation marks are popular; in VMS, quotation marks; in Windows, at least one of those should be effective. Note that you may need to use -O, because otherwise the wget-generated output file name may be too ugly for your file system. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street(+1) 651-699-9818 Saint Paul MN 55105-2547
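The fix Steven suggested can be demonstrated locally. The host and query below are made up; the point is that the two troublesome characters (& and >) survive only when the URL is quoted:

```shell
# A URL with the two troublesome characters; host and query are made up.
URL='http://example.com/search?a=1&range=2000->2010'

# Unquoted, the shell treats & as "run the command in the background"
# and > as an output redirection, so wget would receive a mangled URL.
# Quoted, it arrives intact:
printf '%s\n' "$URL"
# Real usage would be:  wget --convert-links "$URL"
```

On Windows's cmd.exe the same idea applies with double quotes around the whole URL, since `&` and `>` are command separators and redirection there too.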
wildcards in filenames
Forgive me, as I have no experience with Unix, and this might be the source of my problem. . . . How do I specify that I only want, e.g., HTML files that begin with a certain letter from a directory? Putting /l*.htm at the end of the URL did not work: Warning: wildcards not supported in HTTP. Putting l*.htm after the URL (separated with a space) did not work either. Thanks, Alan
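As the warning says, HTTP has no server-side wildcards, but wget can filter client-side during a recursive crawl with -A/--accept, which takes comma-separated suffixes or shell-style glob patterns. A sketch, with a made-up site; the local function just mirrors the glob test wget applies to each candidate filename:

```shell
# No server-side wildcards in HTTP, but a recursive crawl can filter
# client-side (site URL is hypothetical):
#   wget -r -l1 --no-parent -A 'l*.htm,l*.html' http://example.com/docs/
# The same shell-style glob logic, shown locally:
matches() { case "$1" in l*.htm|l*.html) echo keep ;; *) echo skip ;; esac; }
matches lion.html
matches tiger.htm
```

Note that -A applies to file names, not whole URLs, and files rejected by -A are still downloaded and then deleted when they are HTML pages needed to continue the recursion.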
Re: Question re web link conversions
Steven, I'm not trying to blame wget, but rather to understand what is going on and perhaps how to correct it. I am using wget version 1.10.2 and Internet Explorer 6.0.2800.1106 on Windows 98SE. However, when I renamed the file, this problem did not occur. So, I think it was something to do with the characters in the filename, which you mentioned. Thanks, Alan - Original Message - From: Steven M. Schweda [EMAIL PROTECTED] To: WGET@sunsite.dk Cc: [EMAIL PROTECTED] Sent: Tuesday, March 13, 2007 1:23 AM Subject: Re: Question re web link conversions From: Alan Thomas As usual, "wget" without a version does not adequately describe the wget program you're using, "Internet Explorer" without a version does not adequately describe the Web browser you're using, and I can only assume that you're doing all this on some version or other of Windows. It might help to know which of everything you're using. (But it might not.) Using GNU Wget 1.10.2c built on VMS Alpha V7.3-2 (wget -V), I had no such trouble with either a Mozilla or an old Netscape 3 browser. (I did need to rename the resulting file to something with fewer exotic characters before I could get either browser to admit that the file existed, but it's hard to see how that could matter much.) It's not obvious to me how any browser could invent a URL to which to go Back, so my first guess is operator error, but it's even less obvious to me how anything wget could do could cause this behavior, either. You might try it with Firefox or any browser with no history which might confuse a Back button. If there's a way to blame wget for this, I'll be amazed. (That has happened before, however.) Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street (+1) 651-699-9818 Saint Paul MN 55105-2547
Question re web link conversions
I am using the wget command below to get a page from the U.S. Patent Office. This works fine. However, when I open the resulting local file with Internet Explorer (IE), click a link in the file (go to another web site) and then click Back, it goes back to the real web address (http:...) vice the local file (c:\program files\wget\patents\ . . .). Does this have something to do with how wget converts web links? Is there something I should do differently with wget? I'm not clear on why it would do this. When I save this site directly from IE as an HTML file, it works fine. (When I click Back, it goes back to the local file.) Thanks, Alan wget --convert-links --directory-prefix=C:\Program Files\wget\patents --no-clobber http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/PTO/search-adv.html&r=0&p=1&f=S&l=50&Query=ttl/software&d=PG01
Re: Naming output file
Oh. For some reason, I thought this was just a logfile. Thanks, Alan - Original Message - From: Steven M. Schweda [EMAIL PROTECTED] To: WGET@sunsite.dk Cc: [EMAIL PROTECTED] Sent: Saturday, March 10, 2007 11:44 PM Subject: Re: Naming output file From: Alan Thomas Is there a way to tell wget how to name an output file (i.e., not what it is named by the site from which I am retrieving). -O, --output-document=FILE   write documents to FILE. Note that using -O has some side effects which bother some users. Steven M. Schweda [EMAIL PROTECTED] 382 South Warwick Street (+1) 651-699-9818 Saint Paul MN 55105-2547
Syntax for directory names
I would like the output files from a wget command to be stored in a subdirectory (output) of the current directory. To do this, I used the directory-prefix option, but it does not seem to like my syntax. It gives me the error "missing URL". The variations I tried for the option are: --directory-prefix=/output/ --directory-prefix=/output --directory-prefix=output Here is one variation of the complete command I am using: wget --convert-links --directory-prefixoutput --no-clobber http://www.xyz.com Anybody know what I'm doing wrong? Thanks, Alan
Re: Syntax for directory names
Never mind. The correct syntax is --directory-prefix=c:\output\ However, the error was due to a Return vice space before the URL. Duh! Alan - Original Message - From: Alan Thomas To: wget@sunsite.dk Sent: Saturday, March 10, 2007 9:07 PM Subject: Syntax for directory names I would like the output files from a wget command to be stored in a subdirectory (output) of the current directory. To do this, I used the directory-prefix option, but it does not seem to like my syntax. It gives me the error "missing URL". The variations I tried for the option are: --directory-prefix=/output/ --directory-prefix=/output --directory-prefix=output Here is one variation of the complete command I am using: wget --convert-links --directory-prefixoutput --no-clobber http://www.xyz.com Anybody know what I'm doing wrong? Thanks, Alan
Naming output file
Is there a way to tell wget how to name an output file (i.e., not what it is named by the site from which I am retrieving). Thanks, Alan
Re: php form
I see now that I should not have had the single quotes (') around country=US. - Original Message - From: Tony Lewis To: 'Alan Thomas' ; wget@sunsite.dk Sent: Thursday, February 22, 2007 8:59 PM Subject: RE: php form The table stuff just affects what's shown on the user's screen. It's the input field that affects what goes to the server; in this case, that's <input ... name="country" ...> so you want to post country=US. If there were multiple fields, you would separate them with ampersands, such as country=US&state=CA. Tony -- From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Thursday, February 22, 2007 5:27 PM To: Tony Lewis; wget@sunsite.dk Subject: Re: php form Tony, Thanks. I have to log in with username/password, and I think I know how to do that with wget using POST. For the actual search page, the HTML source says it's: <form action="full_search.php" method="POST"> However, I'm not clear on how to convey the data for the search. The search form is defined in a table. One of the entries, for example, is: <tr> <td><b><font face="Arial">Search by Country:</font></b></td> <td><input type="text" name="country" size="50" maxlength="100"></td> </tr> If I want to use wget to search for entries in the U.S. (US), then how do I convey this when I POST to the php script? Thanks, Alan - Original Message - From: Tony Lewis To: 'Alan Thomas' ; wget@sunsite.dk Sent: Thursday, February 22, 2007 12:53 AM Subject: RE: php form Look for <form action="some-web-page" method="XXX"> ... action tells you where the form fields are sent. method tells you if the server is expecting the data to be sent using a GET or POST command; GET is the default. In the case of GET, the arguments go into the URL. If method is POST, follow the instructions in the manual. Hope that helps. Tony From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 21, 2007 4:39 PM To: wget@sunsite.dk Subject: php form There is a database on a web server (to which I have access) that is accessible via username/password. 
The only way for users to access the database is to use a form with search criteria and then press a button that starts a php script that produces a web page with the results of the search. I have a couple of questions: 1. Is there any easy way to know exactly what commands are behind the button, to duplicate them? 2. If so, then do I just use the POST command as described in the manual, after logging in (per the manual), to get the data it provides. I have used wget just a little, but I am completely new to php. Thanks, Alan
Re: php form
Tony, After logging in (saving the cookies), following your instructions and the manual, I used the following command: wget --load-cookies cookies.txt --post-data 'country=US' http://www.xxx.yyy/search/full_search.php The data was not filtered using country=US; rather, all of the data without that filter (from all countries) came back. Also, this data was in a .php file. Do you know what I am doing wrong? Thanks, Alan - Original Message - From: Tony Lewis To: 'Alan Thomas' ; wget@sunsite.dk Sent: Thursday, February 22, 2007 8:59 PM Subject: RE: php form The table stuff just affects what's shown on the user's screen. It's the input field that affects what goes to the server; in this case, that's <input ... name="country" ...> so you want to post country=US. If there were multiple fields, you would separate them with ampersands, such as country=US&state=CA. Tony From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Thursday, February 22, 2007 5:27 PM To: Tony Lewis; wget@sunsite.dk Subject: Re: php form Tony, Thanks. I have to log in with username/password, and I think I know how to do that with wget using POST. For the actual search page, the HTML source says it's: <form action="full_search.php" method="POST"> However, I'm not clear on how to convey the data for the search. The search form is defined in a table. One of the entries, for example, is: <tr> <td><b><font face="Arial">Search by Country:</font></b></td> <td><input type="text" name="country" size="50" maxlength="100"></td> </tr> If I want to use wget to search for entries in the U.S. (US), then how do I convey this when I POST to the php script? Thanks, Alan - Original Message - From: Tony Lewis To: 'Alan Thomas' ; wget@sunsite.dk Sent: Thursday, February 22, 2007 12:53 AM Subject: RE: php form Look for <form action="some-web-page" method="XXX"> ... action tells you where the form fields are sent. method tells you if the server is expecting the data to be sent using a GET or POST command; GET is the default. In the case of GET, the arguments go into the URL. 
If method is POST, follow the instructions in the manual. Hope that helps. Tony -- From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 21, 2007 4:39 PM To: wget@sunsite.dk Subject: php form There is a database on a web server (to which I have access) that is accessible via username/password. The only way for users to access the database is to use a form with search criteria and then press a button that starts a php script that produces a web page with the results of the search. I have a couple of questions: 1. Is there any easy way to know exactly what commands are behind the button, to duplicate them? 2. If so, then do I just use the POST command as described in the manual, after logging in (per the manual), to get the data it provides. I have used wget just a little, but I am completely new to php. Thanks, Alan
Re: php form
Tony, Thanks. I have to log in with username/password, and I think I know how to do that with wget using POST. For the actual search page, the HTML source says it's: <form action="full_search.php" method="POST"> However, I'm not clear on how to convey the data for the search. The search form is defined in a table. One of the entries, for example, is: <tr> <td><b><font face="Arial">Search by Country:</font></b></td> <td><input type="text" name="country" size="50" maxlength="100"></td> </tr> If I want to use wget to search for entries in the U.S. (US), then how do I convey this when I POST to the php script? Thanks, Alan - Original Message - From: Tony Lewis To: 'Alan Thomas' ; wget@sunsite.dk Sent: Thursday, February 22, 2007 12:53 AM Subject: RE: php form Look for <form action="some-web-page" method="XXX"> ... action tells you where the form fields are sent. method tells you if the server is expecting the data to be sent using a GET or POST command; GET is the default. In the case of GET, the arguments go into the URL. If method is POST, follow the instructions in the manual. Hope that helps. Tony -- From: Alan Thomas [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 21, 2007 4:39 PM To: wget@sunsite.dk Subject: php form There is a database on a web server (to which I have access) that is accessible via username/password. The only way for users to access the database is to use a form with search criteria and then press a button that starts a php script that produces a web page with the results of the search. I have a couple of questions: 1. Is there any easy way to know exactly what commands are behind the button, to duplicate them? 2. If so, then do I just use the POST command as described in the manual, after logging in (per the manual), to get the data it provides. I have used wget just a little, but I am completely new to php. Thanks, Alan
php form
There is a database on a web server (to which I have access) that is accessible via username/password. The only way for users to access the database is to use a form with search criteria and then press a button that starts a php script that produces a web page with the results of the search. I have a couple of questions: 1. Is there any easy way to know exactly what commands are behind the button, to duplicate them? 2. If so, then do I just use the POST command as described in the manual, after logging in (per the manual), to get the data it provides. I have used wget just a little, but I am completely new to php. Thanks, Alan
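Pulling the thread's advice together, the whole flow might look like the sketch below. The host, paths, and field names are assumptions based on the snippets quoted in the replies; --save-cookies, --keep-session-cookies, --load-cookies, and --post-data are all standard wget options:

```shell
# Field/value pairs are joined with & and must be quoted, or the shell
# splits the command at the ampersand:
POST_DATA='country=US&state=CA'
printf '%s\n' "$POST_DATA"

# Hypothetical session: log in, keep the session cookie, then POST the
# search form the way the browser button would.
#   wget --save-cookies cookies.txt --keep-session-cookies \
#        --post-data 'username=alan&password=secret' https://www.example.com/login.php
#   wget --load-cookies cookies.txt --post-data "$POST_DATA" \
#        -O results.html https://www.example.com/search/full_search.php
```

If the filter is still ignored (as reported above), a common cause is that the login did not actually stick, or that the form posts to a different URL than the page it appears on; the form's action attribute, resolved against the page URL, is the address to POST to.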
Re: Re: local HTML files
Thanks! - Original Message - From: Hrvoje Niksic [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Friday, April 29, 2005 6:16 AM Subject: Re: local HTML files

Alan Thomas [EMAIL PROTECTED] writes: Can I somehow give wget an HTML file's local hard disk location instead of a URL and have it retrieve the files at URLs referenced in that HTML file?

If I understand you correctly, it would be: wget --force-html -i file
local HTML files
Can I somehow give wget an HTML file's local hard disk location instead of a URL and have it retrieve the files at URLs referenced in that HTML file? Thanks, Alan
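Spelling out the answer from the thread: -i names an input file of links, and --force-html makes wget parse that file as HTML rather than as a plain list of URLs. If the saved page uses relative links, --base supplies the URL to resolve them against. The file contents and base URL below are hypothetical, and echo only prints the command rather than fetching anything.

```shell
dir=$(mktemp -d) && cd "$dir"

# A stand-in for the locally saved page (hypothetical content).
cat > saved_page.html <<'EOF'
<html><body>
<a href="docs/report.pdf">report</a>
<a href="https://example.com/other.pdf">other</a>
</body></html>
EOF

# --force-html parses the input file as HTML instead of a bare URL list;
# --base resolves the relative docs/report.pdf link against the site.
CMD='wget --force-html --base="https://example.com/" -i saved_page.html'
echo "$CMD"
```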
Re: frames
It doesn't seem to download the file when I use the debug option. It just quickly says "finished." Hrvoje Niksic [EMAIL PROTECTED] wrote: Alan Thomas [EMAIL PROTECTED] writes: The log file looks like: 17:54:41 URL:https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET [565/565] -> "123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET.html" [1] FINISHED --17:54:41-- Downloaded: 565 bytes in 1 files

That's not a debug log. You get a debug log by specifying "-d" along with other Wget options.
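To make the distinction in this exchange concrete: -o only redirects whatever log wget is already producing (the terse -nv summary quoted above), while -d is what switches that output to a full debug trace. A sketch, using the URL from the thread; echo prints the command here instead of running it.

```shell
# -d turns on the debug trace; -o captures it in a file instead of the
# terminal. Without -d, the log stays a short transfer summary.
CMD='wget -d -o debug.log "https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET"'
echo "$CMD"
```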
Re: frames
The log file looks like: 17:54:41 URL:https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET [565/565] -> "123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET.html" [1] FINISHED --17:54:41-- Downloaded: 565 bytes in 1 files

Hrvoje Niksic [EMAIL PROTECTED] wrote: Alan Thomas [EMAIL PROTECTED] writes: I use Internet Explorer. I disabled Active Scripting and Scripting of Java Applets, but I can still access this page normally (even after a restart).

Then the problem is probably not JavaScript-related after all. A debug log might help show where the problem lies. (When accessing intranet sites, be careful to erase any sensitive information before sending the log.)
Re: frames
That's probably it. Is there anything I can do to automatically get the files with wget? Thanks, Alan

- Original Message - From: "Hrvoje Niksic" [EMAIL PROTECTED] To: "Alan Thomas" [EMAIL PROTECTED] Cc: wget@sunsite.dk Sent: Friday, April 15, 2005 7:23 PM Subject: [spam] Re: frames

"Alan Thomas" [EMAIL PROTECTED] writes: A website uses frames, and when I view it in Explorer, it has the URL https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET and a bunch of PDF files in two of the frames. When I try to recursively download this web site, I don't get the files. [...] Does wget work with frames? Do I need to do something different?

Wget should work with frames. Maybe the site is using JavaScript to load the frame contents?
re: frames
I use Internet Explorer. I disabled Active Scripting and Scripting of Java Applets, but I can still access this page normally (even after a restart).
frames
A website uses frames, and when I view it in Explorer, it has the URL https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET and a bunch of PDF files in two of the frames. When I try to recursively download this web site, I don't get the files. I am using the following command:

wget -nc -x -r -l10 -p -E -np -t10 -k -o frames.log -nv -A*.* -H "https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET"

The log file looks like:

17:54:41 URL:https://123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET [565/565] -> "123.456.89.01/blabla.nsf/HOBART?opeNFRAMESET.html" [1] FINISHED --17:54:41-- Downloaded: 565 bytes in 1 files

Does wget work with frames? Do I need to do something different? Thanks, Alan
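One way to see why recursion can stop at the frameset page: the 565 bytes downloaded above are just the frameset HTML, whose <frame src=...> references wget should normally follow. If it doesn't (for instance because JavaScript assembles the real URLs), a manual workaround is to extract the src values and feed them back to wget with -i. A sketch over a stand-in frameset file; the base-URL line at the end is hypothetical and commented out.

```shell
dir=$(mktemp -d) && cd "$dir"

# Stand-in for the downloaded frameset page (hypothetical content).
cat > frameset.html <<'EOF'
<frameset cols="50%,50%">
  <frame src="left.pdf">
  <frame src="right.pdf">
</frameset>
EOF

# Pull each frame's src attribute out into a URL list...
grep -o 'src="[^"]*"' frameset.html | cut -d'"' -f2 > frame-urls.txt
cat frame-urls.txt

# ...then retrieve them, resolving relative links against the site:
# wget -B "https://123.456.89.01/blabla.nsf/" -i frame-urls.txt
```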
Re: [unclassified] Re: newbie question
I got the wgetgui program and used it successfully. The commands were very much like this one. Thanks, Alan

- Original Message - From: Technology Freak [EMAIL PROTECTED] To: Alan Thomas [EMAIL PROTECTED] Sent: Thursday, April 14, 2005 10:12 AM Subject: [unclassified] Re: newbie question

Alan, You could try something like this: wget -r -d -l1 -H -t1 -nd -N -np -A pdf URL

On Wed, 13 Apr 2005, Alan Thomas wrote: Date: Wed, 13 Apr 2005 16:02:40 -0400 From: Alan Thomas [EMAIL PROTECTED] To: wget@sunsite.dk Subject: newbie question

I am having trouble getting the files I want using a wildcard specifier (-A option = accept list). The following command works fine to get an individual file: wget https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/160RDTEN_FY06PB.pdf However, I cannot get all the PDF files with this command: wget -A *.pdf https://164.224.25.30/FY06.nsf/($reload)/85256F8A00606A1585256F900040A32F/$FILE/

--- TekPhreak [EMAIL PROTECTED] http://www.tekphreak.com
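Two details in this thread are worth making explicit. First, -A is a filter applied during recursive retrieval, so it needs -r (as in the suggested command) to have any effect. Second, an unquoted *.pdf can be expanded by the shell before wget ever sees it; quoting keeps the pattern intact. A demonstration of the quoting issue, using the file name from the message, created locally just for the demo:

```shell
dir=$(mktemp -d) && cd "$dir"

# Pretend a matching file already exists in the working directory.
touch 160RDTEN_FY06PB.pdf

# Unquoted: the shell substitutes the local file name for the pattern.
UNQUOTED=$(echo -A *.pdf)
# Quoted: the literal pattern reaches the command unchanged.
QUOTED=$(echo -A "*.pdf")

echo "$UNQUOTED"   # -A 160RDTEN_FY06PB.pdf
echo "$QUOTED"     # -A *.pdf

# A working recursive form (URL shortened here) would therefore be:
# wget -r -l1 -nd -A pdf "https://164.224.25.30/FY06.nsf/..."
```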