Re: wget url with hash # issue
Aram Wool wrote:
> Hi, I'm having trouble retrieving an mp3 file from a url of the form
>
> http://www.websitename.com/HTML/typo3conf/ext/naksci_synd/mod1/index.php?mode=LATEST&pid=13&recursive=255&feeduid=1&feed=Normal&user=8&hash=d84a36bbaa1906cc07007557c6b60395
>
> entering this url in a browser opens the 'save as' dialogue box for the mp3, but the file isn't found if wget is used instead.

Well, since the above URL doesn't point to any real resource, we can't really track down what problems you may be having. Also, the URL doesn't seem to have anything to do with the subject of your message, which mentions a "hash #" (unless you mean "hash number", the last parameter in the query string; that's ambiguous, because the # itself is often called a "hash mark").

Since you haven't given us enough information to help you, I can only hazard a wild guess, and wonder if the site might be explicitly blocking wget, in which case you can use the --user-agent option to trick it (try a value like 'Mozilla', or emulate whatever your browser sends).

> Also, is it possible to add an asterisk to a url so as to indicate that wget should ignore the characters before or after it?

I really don't understand what you're asking for here. If you want Wget to ignore the characters you've specified, why specify them in the first place? If you mean that you want Wget to find any file that matches that wildcard, well no: Wget can do that for FTP, which supports directory listings; it can't do that for HTTP, which has no means for listing files in a directory (unless it has been extended, for example with WebDAV, to do so).

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
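For illustration only, here is a minimal Python sketch of the User-Agent idea above (the Wget equivalent is simply passing --user-agent with a browser-like value). The helper names build_request and fetch, and the value "Mozilla/5.0", are assumptions made for this example; whether the site in question filters on User-Agent at all is a guess, as the reply says.

```python
import urllib.request


def build_request(url: str, user_agent: str = "Mozilla/5.0") -> urllib.request.Request:
    """Return a request carrying a browser-like User-Agent header.

    The default value is only an example; copying the exact string a real
    browser sends may be needed if the server is picky.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str = "Mozilla/5.0", timeout: float = 30.0) -> bytes:
    """Download the resource at url and return its body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

If the download works with a browser-like header but fails without it, that would support the guess that the server treats clients differently by User-Agent.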
RE: wget url with hash # issue
Micah Cowan wrote:
> If you mean that you want Wget to find any file that matches that wildcard, well no: Wget can do that for FTP, which supports directory listings; it can't do that for HTTP, which has no means for listing files in a directory (unless it has been extended, for example with WebDAV, to do so).

Seems to me that is a big "unless", because we've all seen lots of websites that have HTTP directory listings. Apache will do it out of the box (and by default) if there is no index.htm[l] file in the directory.

Perhaps we could have a feature to grab all or some of the files in an HTTP directory listing. Maybe something like this could be made to work:

    wget http://www.exelana.com/images/mc*.gif

Perhaps we would need an option such as --http-directory (the first thing that came to mind, but not necessarily the most intuitive name for the option) to explicitly tell wget how it is expected to behave. Or perhaps it can just try stripping the filename when doing an HTTP request and wildcards are specified.

At any rate (with or without the command line option), wget would retrieve http://www.exelana.com/images/ and then retrieve any links where the target matches mc*.gif.

If wget is going to explicitly support HTTP directory listings, it probably needs to be intelligent enough to ignore the sorting options. In the case of Apache, that would be things like <A HREF="?N=D">Name</A>.

Anyone have any idea how many different HTTP directory listing formats are out there?

Tony
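The proposal above (fetch the directory page, then keep only links whose target matches the wildcard, skipping Apache's sorting links) can be sketched roughly as follows. This is a Python illustration, not Wget code: LinkCollector and match_listing are hypothetical names, and treating every href that starts with "?" as a sorting link is an assumption based on the Apache example given in the message.

```python
import posixpath
from fnmatch import fnmatchcase
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> works too.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def match_listing(listing_html, base_url, pattern):
    """Return absolute URLs from a directory listing whose file name matches pattern."""
    parser = LinkCollector()
    parser.feed(listing_html)
    matches = []
    for href in parser.hrefs:
        if href.startswith("?"):
            # Assumed to be a column-sorting link such as ?N=D; skip it.
            continue
        target = urljoin(base_url, href)
        name = posixpath.basename(urlsplit(target).path)
        if fnmatchcase(name, pattern):
            matches.append(target)
    return matches
```

For a listing of http://www.exelana.com/images/ and the pattern mc*.gif, match_listing would return only the links whose final path component starts with "mc" and ends with ".gif". Other servers format their listings differently, which is exactly the open question raised at the end of the message.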
Re: Myriad merges
Quoting Jochen Roderburg [EMAIL PROTECTED]:
> So it looks now to me, that the new error (local timestamp not set to remote) only occurs in the cases when no HEAD is used.

This (new) piece of code in http.c (line 2666 ff.) looks very suspicious to me, especially the time_came_from_head bit:

    /* Reparse time header, in case it's changed. */
    if (time_came_from_head
        && hstat.remote_time && hstat.remote_time[0])
      {
        newtmr = http_atotm (hstat.remote_time);
        if (newtmr != -1)
          tmr = newtmr;
      }

Other than that, I have now used the current svn version for a few more days with all my work, and I would say all the issues that had bothered me in the recent development cycles are corrected now. I'll see, however, whether I can make a few more systematic tests with some combinations of the relevant options which I do not usually use in practice.

What I have seen new are some cosmetic issues in the program output when HTTP restarts happen. Such restarts are normally rare these days, but I have some sites far away where bad connections and timeouts suddenly reappeared. One looks pretty simple; I think I can prepare a patch myself on the weekend, when I have access to my Linux development system at home again. I'll report details in a separate mail later, when I have examples for the cases.

Best regards,
Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10
D-50931 Koeln
Germany
Tel.: +49-221/478-7024
E-Mail: [EMAIL PROTECTED]
Re: Myriad merges
Jochen Roderburg wrote:
> Quoting Jochen Roderburg [EMAIL PROTECTED]:
>> So it looks now to me, that the new error (local timestamp not set to remote) only occurs in the cases when no HEAD is used.
>
> This (new) piece of code in http.c (line 2666 ff.) looks very suspicious to me, especially the time_came_from_head bit:
>
>     /* Reparse time header, in case it's changed. */
>     if (time_came_from_head
>         && hstat.remote_time && hstat.remote_time[0])
>       {
>         newtmr = http_atotm (hstat.remote_time);
>         if (newtmr != -1)
>           tmr = newtmr;
>       }

The intent behind this code is to ensure that we parse the Last-Modified date again, even if we already parsed Last-Modified, if the last one we parsed came from the HEAD. This whole block of code that you've pasted is new, not just the surrounding if clause; if we never sent a HEAD but only a GET, the Last-Modified _should_ have been parsed in code that appears before here.

...but, obviously, things aren't working quite as they should, so I need to look into it more closely.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
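To make the control flow under discussion easier to follow, here is a rough Python transliteration of the quoted block. It is a sketch, not Wget's implementation: http_atotm below is a stand-in built on Python's email.utils rather than Wget's own parser, and reparse_time is a hypothetical wrapper name. The -1 failure value mirrors the convention visible in the quoted C code.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def http_atotm(value):
    """Parse an HTTP date string into a Unix timestamp, or return -1 on failure."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return -1
    if parsed.tzinfo is None:
        # Treat dates without a usable zone as UTC, as HTTP dates are GMT.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def reparse_time(tmr, time_came_from_head, remote_time):
    """Return the timestamp to use after the GET response has arrived.

    Only when the earlier value came from a HEAD response, and the GET
    supplied a non-empty Last-Modified string, is the header parsed again.
    """
    if time_came_from_head and remote_time:
        newtmr = http_atotm(remote_time)
        if newtmr != -1:
            tmr = newtmr
    return tmr
```

Note that when time_came_from_head is false the function returns tmr unchanged, so in the no-HEAD case the result depends entirely on tmr having been set correctly by earlier code, which is the path both messages suggest needs a closer look.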
Re: wget syntax problem ?
On 9/6/07, Alan Thomas [EMAIL PROTECTED] wrote:
> I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command
>
>     cd ..
>     wget --convert-links --directory-prefix=C:\WINDOWS\Profiles\Alan000\Desktop\wget\CNN\ --no-clobber http://www.cnn.com;

Don't use backslashes in filenames. If you do, use `\\` instead.
Re: wget syntax problem ?
Alan Thomas wrote:
> command.com
>
> By the way, Josh's and your messages are being put out to the list in duplicate (at least, that's what I'm seeing on my end).

Not really; we've been Cc'ing you. I don't think we knew whether you were subscribed or not, and so Cc'd you in case you weren't. Also, many of us just habitually hit Reply All to reply to the message, so we don't accidentally send it to the message's author only. :)

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
Re: wget syntax problem ?
command.com

By the way, Josh's and your messages are being put out to the list in duplicate (at least, that's what I'm seeing on my end).

----- Original Message -----
From: Micah Cowan [EMAIL PROTECTED]
To: Alan Thomas [EMAIL PROTECTED]
Cc: wget@sunsite.dk
Sent: Thursday, September 06, 2007 9:34 PM
Subject: Re: wget syntax problem ?

> Alan Thomas wrote:
>> Please ignore. It was needing the \\, like Josh said.
>
> Out of curiosity, what command interpreter were you using? Was this command.com, or something else like rxvt/Cygwin?
Re: wget syntax problem ?
Alan Thomas wrote:
> Please ignore. It was needing the \\, like Josh said.

Out of curiosity, what command interpreter were you using? Was this command.com, or something else like rxvt/Cygwin?

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
Re: wget syntax problem ?
Please ignore. It was needing the \\, like Josh said.

----- Original Message -----
From: Alan Thomas [EMAIL PROTECTED]
To: Josh Williams [EMAIL PROTECTED]; wget@sunsite.dk
Sent: Thursday, September 06, 2007 9:25 PM
Subject: Re: wget syntax problem ?

> Wget does not like my use of the --directory-prefix= option. Anyone know why?
>
> ----- Original Message -----
> From: Josh Williams [EMAIL PROTECTED]
> To: Alan Thomas [EMAIL PROTECTED]
> Cc: wget@sunsite.dk
> Sent: Thursday, September 06, 2007 8:53 PM
> Subject: Re: wget syntax problem ?
>
>> On 9/6/07, Alan Thomas [EMAIL PROTECTED] wrote:
>>> I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command
>>>
>>>     cd ..
>>>     wget --convert-links --directory-prefix=C:\WINDOWS\Profiles\Alan000\Desktop\wget\CNN\ --no-clobber http://www.cnn.com;
>>
>> Don't use backslashes in filenames. If you do, use `\\` instead.
Re: wget syntax problem ?
Alan Thomas wrote:
> I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command

It sounds to me like you don't have wget in your PATH. Make sure that wget is located somewhere where command.com (or whatever) can find it.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
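One way to separate the two suspects in this thread (wget missing from PATH versus a path argument mangled by escaping) is to look the program up explicitly and pass the arguments as a list, so no command interpreter is involved in quoting. The following Python sketch assumes that approach; build_wget_command and run_wget are hypothetical helper names, and the options are the ones from the original batch file.

```python
import shutil
import subprocess


def build_wget_command(url, directory_prefix, wget="wget"):
    """Return the argument list for the wget call from the batch file.

    Passing a list (rather than one command string) means backslashes in
    directory_prefix reach wget unchanged, with no shell escaping rules.
    """
    return [
        wget,
        "--convert-links",
        f"--directory-prefix={directory_prefix}",
        "--no-clobber",
        url,
    ]


def run_wget(url, directory_prefix):
    """Run wget if it can be found on PATH and return its exit status."""
    wget_path = shutil.which("wget")
    if wget_path is None:
        # Python-side counterpart of "Bad command or file name".
        raise FileNotFoundError("wget was not found on PATH")
    command = build_wget_command(url, directory_prefix, wget_path)
    return subprocess.run(command, check=False).returncode
```

If run_wget raises FileNotFoundError, the PATH explanation applies; if wget starts but rejects the directory prefix, the problem lies in how the path is written.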
Re: wget syntax problem ?
Wget does not like my use of the --directory-prefix= option. Anyone know why?

----- Original Message -----
From: Josh Williams [EMAIL PROTECTED]
To: Alan Thomas [EMAIL PROTECTED]
Cc: wget@sunsite.dk
Sent: Thursday, September 06, 2007 8:53 PM
Subject: Re: wget syntax problem ?

> On 9/6/07, Alan Thomas [EMAIL PROTECTED] wrote:
>> I know this is probably something simple I screwed up, but the following commands in a Windows batch file return the error "Bad command or file name" for the wget command
>>
>>     cd ..
>>     wget --convert-links --directory-prefix=C:\WINDOWS\Profiles\Alan000\Desktop\wget\CNN\ --no-clobber http://www.cnn.com;
>
> Don't use backslashes in filenames. If you do, use `\\` instead.
Re: wget syntax problem ?
On 9/6/07, Micah Cowan [EMAIL PROTECTED] wrote:
> Not really; we've been Cc'ing you. I don't think we knew whether you were subscribed or not, and so Cc'd you in case you weren't. Also, many of us just habitually hit Reply All to reply to the message, so we don't accidentally send it to the message's author only. :)

Aye. Gmail doesn't have that problem, though. If it finds a duplicate message from a mailing list, it only shows me the one from the list. Kind of nice.
Re: Files returned by ASP
Alan Thomas wrote:
> Is there a way to use wget to get files from links that result from Active Server Pages (ASPs) on a web page? For example, to get the files in the links on the page returned by the URL http://www.onr.navy.mil/about/conferences/rd_partner/2007/presentations_03.asp.
>
> Thanks, Alan

Sure, check out what the Wget manual has to say about recursive fetching:

http://www.gnu.org/software/wget/manual/html_node/Recursive-Download.html#Recursive-Download

-- 
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
Announcing... The Wget Wgiki!
The main informational site for GNU Wget is now at http://wget.addictivecode.org/, the Wget Wgiki.

--

The original motivation for starting a wiki for Wget was that I needed a forum for collaboration on specifications and design for future features in Wget, and particularly in what we've been calling "Wget 2.0", the next generation of Wget. Features that have been (tentatively) suggested or planned for Wget 2.0 include:

* Support for multiple connections simultaneously
* Configuration options on a per-host and/or per-initial URI subpath basis.
* Accept/reject (and others) based on MIME type.
* Support for the use of regular expressions.
* A recursive-fetch metadatabase, to save download information such as mappings between local filenames and originating URIs, MIME types, HTTP entity identifiers, etc.
* A plugin architecture.
* Support for parsing of non-HTML files for links to follow.
* Support for handing-off specific HTML elements to plugins for special handling
* Support for extending Wget with new protocols
* Better encapsulation of the file-system, to hide local filename restrictions and such from the download logic.
* Support for Internationalized Resource Identifiers (IRIs).
* Some level of JavaScript support **
* Support for the Metalink format **

** For various reasons, JavaScript and Metalink support will probably not be part of canonical Wget, but would take advantage of the plugin architecture and be distributed separately from the core Wget source. Development for these features might be separate from core Wget development.

Some of these things necessitate a complete restructuring of Wget's logic, very possibly a complete or near-rewrite. It is also possible that the configuration and command-line interface syntaxes would need to be reimagined, in which case a name change for the next generation Wget might begin to show merit.

The feature specifications and design discussions for these elements will live at http://wget.addictivecode.org/FeatureSpecifications. I have started a few of them off, most still need to be started, and all need help.

--

An aside: I do not want to give the idea that Wget is going to go from a Swiss Army Knife to a Combination Hand-pistol/tank/aircraft-carrier/missile-launch-silo ;)

As I see it, Wget's major boons have been its relatively small footprint, its speed and efficiency, and its ability to (usually) "Do what I want". I do not wish to abandon these things. This was a major factor in the decision to isolate features like Metalink and JavaScript into plugins: with a plugin architecture, if the users /want/ the Combination Hand-pistol/..., they can just load up the tank and missile-silo modules! ;)

--

At any rate, I felt that having a wiki for discussion of these things would prove invaluable, so I started work on this last week. But while I was working on these things, it became more and more obvious how much of a benefit it could be in serving as the main repository for even general, non-developer-oriented information for Wget.

This is a somewhat abrupt turn from my desire to make the gnu.org site the main source of information about Wget, but I believe it'll be much easier in the long run.

Please do check the site out, and help to improve it! Most of the content from the old site should have moved to the wiki (the old site has already been updated to direct readers there).

- http://wget.addictivecode.org/FeatureSpecifications
  Home for various features that need sketching out (these are intended to be informal specifications, not particularly rigorous; just enough to know what we are doing).

- http://wget.addictivecode.org/Faq
  The FAQ has been updated somewhat, probably worth looking over.

- http://wget.addictivecode.org/TitleIndex
  We don't have that many pages yet; here's the full list. ;)

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/