I always like the first of April jokes. People tend to be very inventive :)

"Unccl svefg bs Ncevy, sbyxf" becomes more readable in Vim: select the line
in visual mode and apply g? (that is, Vg?), which ROT13-encodes the selection.
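For anyone without Vim at hand, here is a minimal sketch of the same ROT13 transformation in Python. It assumes only the standard library's "rot_13" text codec; the helper name rot13 is made up for this illustration and is not part of Vim or Wget.

```python
import codecs


def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places; other characters pass through."""
    # "rot_13" is a str-to-str codec shipped with the Python standard library.
    return codecs.encode(text, "rot_13")


# The signature line from the quoted message.
signature = "Unccl svefg bs Ncevy, sbyxf"
decoded = rot13(signature)
print(decoded)
```

Because ROT13 is its own inverse, calling rot13 on the decoded text gives back the original signature.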
The Gday is nice too.
http://www.google.com.au/intl/en/gday/index.html

Have fun.

On Tue, Apr 1, 2008 at 4:10 AM, Micah Cowan <[EMAIL PROTECTED]> wrote:
> Well, I have some announcements regarding decisions that have been made
> regarding future directions in Wget.
>
> First off, I've reversed my previous decision not to include "download
> accelerator" features in the multi-streaming version of Wget. It's
> becoming clear to me that the benefits far outweigh any disadvantages.
> As tool developers, it's our job to supply powerful jobs; it's the
> users' job to use them with the appropriate discretion. And while it may
> be troublesome to the administrators of smaller servers that may become
> overburdened when less-polite users abuse Wget, yet the careful
> application by users who know that the servers can handle the requests
> have the potential to produce such striking effects on download speeds,
> that it seems to me that it's irresponsible to deny such strong
> improvements to those who can use Wget responsibly, just for the sake of
> those who might abuse it. Considering that the time required to download
> a 2 GB file from the web can be reduced ten-fold, simply by splitting
> the work into ten separate, simultaneous download streams for 200 MB
> each, it's really elitist of us to tell users, "no, you can't do that,
> because you might not know what you're doing."
>
> Besides which, it's quite clear from the number of requests we've
> received for this functionality, that the addition of this feature will
> boost Wget's popularity significantly. We've really no excuse to leave
> it out!
>
> .
>
> Following the same policy of "providing the tool, without dictating the
> use", it has come to my attention that a not-insignificant portion of
> our user base use Wget to perform "screen-scraping" on other sites.
> There are a variety of motivations for such practices, which include
> analysis of periodically-changing data, site-style imitation, and of
> course full look-alike site imitation. The latter is particularly
> popular with websites corresponding to financial institutions.
>
> That last group often consists of users with significant funding at
> their disposal, which they could easily put towards financing further
> Wget development. To this end, there are a few additional features I've
> been considering, aimed at appealing to this portion of Wget's user
> base.
>
> The one I'll mention today is the --ichthus option. Invoking Wget with:
>
>   wget --ichthus URL-A URL-B
>
> Will download URL-A and any prerequisites (images, CSS, etc), perform
> some conversions, and then automatically upload the results to URL-B
> (via FTP or WebDAV, configuration options for which will be discussed at
> a later date).
>
> The specific conversions to be applied after download include converting
> relative URLs to absolute URLs, and the conversion of all
> form-submission URLs to point to locations at the host site for URL-B,
> obfuscating it in such a way as to appear to still be pointing to a
> location on URL-A's host.
>
> For example, if the page at
> https://www.infidelitybanking.com/loginPage.php contains a form whose
> action attribute has the value "loginProcess.php?submit=foo", then
> running:
>
>   wget --ichthus https://www.infidelitybanking.com/loginPage.php \
>        https://256.133.312.10/
>
> would download loginPage.php from site A, and upload it to site B,
> except that any relative links would be converted to absolute links
> (with site A as a baseref); and the HTML form's action would be
> converted to something like:
>
>   https://www.infidelitybanking.com:[EMAIL PROTECTED]/cgi-bin/loginPage.cgi
>
> .
>
> There's been a lot of discussion lately about how the architecture of
> Wget's accept/reject lists could be improved.
> One thing that hasn't had
> much treatment, though (well, any, really) is how potentially
> _demeaning_ the existing terminology can be.
>
> Representing the decision whether or not to download a given URL as
> either "accepted" or "rejected" is a rather harsh, perhaps even cruel,
> way of dividing the world. It can tend to convey the mistaken impression
> that some URLs are intrinsically "bad" while others are intrinsically
> "good". This can have obvious consequences for self-esteem, and yet it's
> clear that a URL that may be "rejected" for a particular session's needs
> today, may well be "accepted" in some future session.
>
> Therefore, I'd like to propose that we replace the current terminology
> with something more politically sensitive. Rather than --accept
> --reject, perhaps --you-fit-my-needs-today and
> --not-a-good-fit-for-me-at-this-time? Those names don't feel quite right
> (in particular, they're a bit lengthy); but I think you get the general
> idea; perhaps someone can suggest something better?
>
> .
>
> Finally, thanks to Julien Buty's helpful recommendation that Wget take
> part in this year's Google Summer of Code, we've received a number of
> excellent proposals from students eager to take part. A few of these
> include some great and novel ideas.
>
> The most promising of these, and something I don't believe previous Wget
> maintainers had given much thought to, is the proposal that Wget support
> HTCPCP (which is based on good ol' HTTP) as one of its primary supported
> transport mechanisms. It's amazing to me that we still currently lack
> support for this protocol, which is such an important part of the World
> Wide Web. In addition, I'm fairly certain that this is one of the few
> transport layers that the Curl guys still have yet to include, so if we
> beat them to the punch, we may have one over on them. :)
>
> More information on this most-venerated of protocols may be had at
> http://www.ietf.org/rfc/rfc2324.txt.
>
> --
> Unccl svefg bs Ncevy, sbyxf.
>
