Re: Ignoring robots.txt [was Re: wget default behavior...]

2007-10-17 Thread Micah Cowan
Tony Godshall wrote: ... Perhaps it should be one of those things that one can do oneself if one must but is generally frowned upon (like making a version of wget that ignores robots.txt). Damn. I was only joking about ignoring robots.txt,

Re: Ignoring robots.txt [was Re: wget default behavior...]

2007-10-17 Thread Tony Godshall
Tony Godshall wrote: ... Perhaps it should be one of those things that one can do oneself if one must but is generally frowned upon (like making a version of wget that ignores robots.txt). Damn. I was only joking about ignoring robots.txt, but now I'm thinking[1] there may be good

Re: Man pages [Re: ignoring robots.txt]

2007-07-20 Thread Micah Cowan
Christopher G. Lewis wrote: Micah et al. - Just for an FYI - the whole texi-info, texi-html and (texi-rtf-hlp) is *very* fragile in the windows world. You actually have to download a *very* old version of makeinfo (1.68, not even on

Re: ignoring robots.txt

2007-07-19 Thread Daniel Stenberg
On Wed, 18 Jul 2007, Micah Cowan wrote: The manpage doesn't need to give as detailed explanations as the info manual (though, as it's auto-generated from the info manual, this could be hard to avoid); but it should fully describe essential features. I know GNU projects for some reason go

Re: ignoring robots.txt

2007-07-19 Thread Andreas Pettersson
Daniel Stenberg wrote: On Wed, 18 Jul 2007, Micah Cowan wrote: The manpage doesn't need to give as detailed explanations as the info manual (though, as it's auto-generated from the info manual, this could be hard to avoid); but it should fully describe essential features. I know GNU

Man pages [Re: ignoring robots.txt]

2007-07-19 Thread Micah Cowan
Daniel Stenberg wrote: On Wed, 18 Jul 2007, Micah Cowan wrote: The manpage doesn't need to give as detailed explanations as the info manual (though, as it's auto-generated from the info manual, this could be hard to avoid); but it should fully

RE: Man pages [Re: ignoring robots.txt]

2007-07-19 Thread Christopher G. Lewis
recall off the top of my head). So if it has to go away, so be it. Christopher G. Lewis http://www.ChristopherLewis.com -Original Message- From: Micah Cowan [mailto:[EMAIL PROTECTED] Sent: Thursday, July 19, 2007 1:16 PM To: WGET@sunsite.dk Subject: Man pages [Re: ignoring robots.txt

Re: ignoring robots.txt

2007-07-18 Thread Maciej W. Rozycki
On Wed, 18 Jul 2007, Josh Williams wrote: Is there any particular reason we don't have an option to ignore robots.txt? There is no particular reason, so we do. Maciej

Re: ignoring robots.txt

2007-07-18 Thread Josh Williams
On 7/18/07, Maciej W. Rozycki [EMAIL PROTECTED] wrote: There is no particular reason, so we do. As far as I can tell, there's nothing in the man page about it.

Re: ignoring robots.txt

2007-07-18 Thread Steven M. Schweda
From: Josh Williams As far as I can tell, there's nothing in the man page about it. It's pretty well hidden. -e robots=off At this point, I normally just grind my teeth instead of complaining about the differences between the command-line options and the commands in the .wgetrc
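For readers landing on this thread, a minimal sketch of the two forms being contrasted here: the command-line route (`-e` runs a .wgetrc-style command) and the equivalent startup-file lines. The URL is a placeholder, the `--wait`/`wait` setting is added only as an illustration of polite crawling, and the wget call is echoed rather than executed so that nothing is actually fetched.

```shell
# Placeholder URL, not taken from the thread.
url="https://example.org/"

# Command-line form: -e executes a .wgetrc-style command.
# Echoed instead of run, so this sketch makes no network requests.
echo wget -e robots=off --wait=1 -r "$url"

# Startup-file form: the same settings written as wgetrc commands.
# A temporary file stands in for ~/.wgetrc here.
rcfile="$(mktemp)"
printf 'robots = off\nwait = 1\n' > "$rcfile"
cat "$rcfile"
```

Dropping the `echo` runs the real command. Ignoring robots.txt is better reserved for sites you control or have permission to crawl.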

Re: ignoring robots.txt

2007-07-18 Thread Micah Cowan
Steven M. Schweda wrote: From: Josh Williams As far as I can tell, there's nothing in the man page about it. It's pretty well hidden. -e robots=off At this point, I normally just grind my teeth instead of complaining about the

RE: ignoring robots.txt

2007-07-18 Thread Tony Lewis
Micah Cowan wrote: The manpage doesn't need to give as detailed explanations as the info manual (though, as it's auto-generated from the info manual, this could be hard to avoid); but it should fully describe essential features. I can't see any good reason for one set of documentation to be

Re: ignoring robots.txt

2007-07-18 Thread Micah Cowan
Tony Lewis wrote: Micah Cowan wrote: The manpage doesn't need to give as detailed explanations as the info manual (though, as it's auto-generated from the info manual, this could be hard to avoid); but it should fully describe essential

RE: ignoring robots.txt

2007-07-18 Thread Tony Lewis
Micah Cowan wrote: Don't we already follow typical etiquette by default? Or do you mean that to override non-default settings in the rcfile or whatnot? We don't automatically use a --wait time between requests. I'm not sure what other nice options we'd want to make easily available, but there

Re: ignoring robots.txt

2007-07-18 Thread Micah Cowan
Tony Lewis wrote: Micah Cowan wrote: Don't we already follow typical etiquette by default? Or do you mean that to override non-default settings in the rcfile or whatnot? We don't automatically use a --wait time between requests. I'm not

Re: ignoring robots.txt

2007-07-18 Thread Hrvoje Niksic
Micah Cowan [EMAIL PROTECTED] writes: I think we should either be a stub, or a fairly complete manual (and agree that the latter seems preferable); nothing half-way between: what we have now is a fairly incomplete manual. Converting from Info to man is harder than it may seem. The script

Man pages [Re: ignoring robots.txt]

2007-07-18 Thread Micah Cowan
Hrvoje Niksic wrote: Micah Cowan [EMAIL PROTECTED] writes: I think we should either be a stub, or a fairly complete manual (and agree that the latter seems preferable); nothing half-way between: what we have now is a fairly incomplete manual.

Re: Man pages [Re: ignoring robots.txt]

2007-07-18 Thread Hrvoje Niksic
Micah Cowan [EMAIL PROTECTED] writes: Converting from Info to man is harder than it may seem. The script that does it now is basically a hack that doesn't really work well even for the small part of the manual that it tries to cover. I'd noticed. :) I haven't looked at the script that