Well, I have some announcements about decisions that have been made regarding future directions in Wget.
First off, I've reversed my previous decision not to include "download accelerator" features in the multi-streaming version of Wget. It's becoming clear to me that the benefits far outweigh any disadvantages. As tool developers, it's our job to supply powerful tools; it's the users' job to use them with the appropriate discretion. And while it may be troublesome to the administrators of smaller servers that may become overburdened when less-polite users abuse Wget, careful application by users who know that the servers can handle the requests has the potential to produce such striking effects on download speeds that it seems to me irresponsible to deny such strong improvements to those who can use Wget responsibly, just for the sake of those who might abuse it.

Considering that the time required to download a 2 GB file from the web can be reduced ten-fold, simply by splitting the work into ten separate, simultaneous download streams of 200 MB each, it's really elitist of us to tell users, "no, you can't do that, because you might not know what you're doing." Besides which, it's quite clear from the number of requests we've received for this functionality that the addition of this feature will boost Wget's popularity significantly. We've really no excuse to leave it out!

Following the same policy of "providing the tool, without dictating the use", it has come to my attention that a not-insignificant portion of our user base uses Wget to perform "screen-scraping" on other sites. There are a variety of motivations for such practices, which include analysis of periodically-changing data, site-style imitation, and of course full look-alike site imitation. The latter is particularly popular with websites corresponding to financial institutions. That last group often consists of users with significant funding at their disposal, which they could easily put towards financing further Wget development.
To this end, there are a few additional features I've been considering, aimed at appealing to this portion of Wget's user base. The one I'll mention today is the --ichthus option. Invoking Wget with:

    wget --ichthus URL-A URL-B

will download URL-A and any prerequisites (images, CSS, etc), perform some conversions, and then automatically upload the results to URL-B (via FTP or WebDAV, configuration options for which will be discussed at a later date). The specific conversions to be applied after download include converting relative URLs to absolute URLs, and the conversion of all form-submission URLs to point to locations at the host site for URL-B, obfuscating it in such a way as to appear to still be pointing to a location on URL-A's host.

For example, if the page at https://www.infidelitybanking.com/loginPage.php contains a form whose action attribute has the value "loginProcess.php?submit=foo", then running:

    wget --ichthus https://www.infidelitybanking.com/loginPage.php \
        https://256.133.312.10/

would download loginPage.php from site A, and upload it to site B, except that any relative links would be converted to absolute links (with site A as a baseref); and the HTML form's action would be converted to something like:

    https://www.infidelitybanking.com:[EMAIL PROTECTED]/cgi-bin/loginPage.cgi

There's been a lot of discussion lately about how the architecture of Wget's accept/reject lists could be improved. One thing that hasn't had much treatment, though (well, any, really) is how potentially _demeaning_ the existing terminology can be. Representing the decision whether or not to download a given URL as either "accepted" or "rejected" is a rather harsh, perhaps even cruel, way of dividing the world. It can tend to convey the mistaken impression that some URLs are intrinsically "bad" while others are intrinsically "good".
This can have obvious consequences for self-esteem, and yet it's clear that a URL that may be "rejected" for a particular session's needs today may well be "accepted" in some future session. Therefore, I'd like to propose that we replace the current terminology with something more politically sensitive. Rather than --accept and --reject, perhaps --you-fit-my-needs-today and --not-a-good-fit-for-me-at-this-time? Those names don't feel quite right (in particular, they're a bit lengthy), but I think you get the general idea; perhaps someone can suggest something better?

Finally, thanks to Julien Buty's helpful recommendation that Wget take part in this year's Google Summer of Code, we've received a number of excellent proposals from students eager to take part. A few of these include some great and novel ideas. The most promising of these, and something I don't believe previous Wget maintainers had given much thought to, is the proposal that Wget support HTCPCP (which is based on good ol' HTTP) as one of its primary supported transport mechanisms. It's amazing to me that we still currently lack support for this protocol, which is such an important part of the World Wide Web. In addition, I'm fairly certain that this is one of the few transport layers that the Curl guys still have yet to include, so if we beat them to the punch, we may have one over on them. :)

More information on this most-venerated of protocols may be had at http://www.ietf.org/rfc/rfc2324.txt.

-- 
Unccl svefg bs Ncevy, sbyxf.