... Perhaps it should be one of those things that one can do
oneself if one must but is generally frowned upon (like making a
version of wget that ignores robots.txt).
Damn. I was only joking about ignoring robots.txt, but now I'm
thinking[1] there may be good reasons to do so... maybe
Tony Godshall wrote:
... Perhaps it should be one of those things that one can do
oneself if one must but is generally frowned upon (like making a
version of wget that ignores robots.txt).
Damn. I was only joking about ignoring robots.txt
Tony Godshall wrote:
... Perhaps it should be one of those things that one can do
oneself if one must but is generally frowned upon (like making a
version of wget that ignores robots.txt).
Damn. I was only joking about ignoring robots.txt, but now I'm
thinking[1] there may be good
Christopher G. Lewis wrote:
Micah et al. -
Just for an FYI - the whole texi-info, texi-html and
(texi-rtf-hlp) is *very* fragile in the Windows world. You actually
have to download a *very* old version of makeinfo (1.68, not even on
On Wed, 18 Jul 2007, Micah Cowan wrote:
The manpage doesn't need to give as detailed explanations as the info manual
(though, as it's auto-generated from the info manual, this could be hard to
avoid); but it should fully describe essential features.
I know GNU projects for some reason go
Daniel Stenberg wrote:
On Wed, 18 Jul 2007, Micah Cowan wrote:
The manpage doesn't need to give as detailed explanations as the info
manual (though, as it's auto-generated from the info manual, this
could be hard to avoid); but it should fully describe essential
features.
I know GNU
Daniel Stenberg wrote:
On Wed, 18 Jul 2007, Micah Cowan wrote:
The manpage doesn't need to give as detailed explanations as the info
manual (though, as it's auto-generated from the info manual, this
could be hard to avoid); but it should fully
recall off the top of my head). So
if it has to go away, so be it.
Christopher G. Lewis
http://www.ChristopherLewis.com
-Original Message-
From: Micah Cowan [mailto:[EMAIL PROTECTED]
Sent: Thursday, July 19, 2007 1:16 PM
To: WGET@sunsite.dk
Subject: Man pages [Re: ignoring robots.txt
On Wed, 18 Jul 2007, Josh Williams wrote:
Is there any particular reason we don't have an option to ignore robots.txt?
There is no particular reason, so we do.
Maciej
On 7/18/07, Maciej W. Rozycki [EMAIL PROTECTED] wrote:
There is no particular reason, so we do.
As far as I can tell, there's nothing in the man page about it.
From: Josh Williams
As far as I can tell, there's nothing in the man page about it.
It's pretty well hidden.
-e robots=off
At this point, I normally just grind my teeth instead of complaining
about the differences between the command-line options and the commands
in the .wgetrc
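
For readers skimming the thread: the option quoted above, -e robots=off, passes a .wgetrc-style command on the command line, and --wait (mentioned elsewhere in this thread) is the switch for pausing between requests. The sketch below only assembles such a command line; build_wget_command is a hypothetical helper written for illustration, not part of wget, and the URL is a placeholder.

```python
import shlex


def build_wget_command(url, ignore_robots=False, wait_seconds=None):
    """Return an argv list for wget (hypothetical helper, not part of wget)."""
    argv = ["wget"]
    if ignore_robots:
        # -e runs a .wgetrc-style command; "robots=off" turns off
        # robots.txt handling for this invocation only.
        argv += ["-e", "robots=off"]
    if wait_seconds is not None:
        # --wait pauses between retrievals; wget does not wait by default.
        argv.append(f"--wait={wait_seconds}")
    argv.append(url)
    return argv


if __name__ == "__main__":
    # Placeholder URL; nothing is downloaded, the command is only printed.
    cmd = build_wget_command("http://example.com/", ignore_robots=True, wait_seconds=2)
    print(shlex.join(cmd))
```

Pairing robots=off with a wait time is one way to stay polite to the server when overriding the default, which is the etiquette concern raised in the thread.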
Steven M. Schweda wrote:
From: Josh Williams
As far as I can tell, there's nothing in the man page about it.
It's pretty well hidden.
-e robots=off
At this point, I normally just grind my teeth instead of complaining
about the
could
--mirror the site while ignoring robots.txt, but even that is legitimate in
many cases.
With regard to user agent, many websites customize their output based on the
browser that is displaying the page. If one does not set user agent to match
their browser, the retrieved content may be very
* spiders crawling through their sites. A well-crafted wget
command that downloads selected information from a site without
regard to the robots.txt restrictions is a very different situation.
It's true that someone could --mirror the site while ignoring
robots.txt, but even that is legitimate in many
Micah Cowan wrote:
Don't we already follow typical etiquette by default? Or do you mean
that to override non-default settings in the rcfile or whatnot?
We don't automatically use a --wait time between requests. I'm not sure what
other nice options we'd want to make easily available, but there
Tony Lewis wrote:
Micah Cowan wrote:
Don't we already follow typical etiquette by default? Or do you
mean that to override non-default settings in the rcfile or
whatnot?
We don't automatically use a --wait time between requests. I'm not
Micah Cowan [EMAIL PROTECTED] writes:
I think we should either be a stub, or a fairly complete manual
(and agree that the latter seems preferable); nothing half-way
between: what we have now is a fairly incomplete manual.
Converting from Info to man is harder than it may seem. The script
Hrvoje Niksic wrote:
Micah Cowan [EMAIL PROTECTED] writes:
I think we should either be a stub, or a fairly complete manual
(and agree that the latter seems preferable); nothing half-way
between: what we have now is a fairly incomplete manual.
Micah Cowan [EMAIL PROTECTED] writes:
Converting from Info to man is harder than it may seem. The script
that does it now is basically a hack that doesn't really work well
even for the small part of the manual that it tries to cover.
I'd noticed. :)
I haven't looked at the script that