Re: wget default behavior [was Re: working on patch to limit to "percent of bandwidth"]

2007-10-16 Thread Tony Godshall
On 10/13/07, Micah Cowan <[EMAIL PROTECTED]> wrote:
>
> > On 10/13/07, Tony Godshall <[EMAIL PROTECTED]> wrote:
> >> OK, so let's go back to basics for a moment.
> >>
> >> wget's default behavior is to use all available bandwidth.
> >>
> >> Is this the right thing to do?
> >>
> >> Or is it better to back off a little after a bit?
>
> Heh. Well, some people are saying that Wget should support "accelerated
> downloads": several connections to download a single resource, which can
> sometimes give a speed increase at the expense of nice-ness.
>
> So you could say we're at a happy medium between those options! :)
>
> Actually, Wget probably will get support for multiple simultaneous
> connections; but number of connections to one host will be limited to a
> max of two.
>
> It's impossible for Wget to know how much is appropriate to back off,
> and in most situations I can think of, backing off isn't appropriate.
>
> In general, though, I agree that Wget's policy should be "nice by default".

If it were up to me, I'd have it back off to 95% by default and
have options for more aggressive behavior, such as multiple
connections.

I'm surprised multiple connections would buy you anything, though.  I
guess I'll take a look through the archives and see what the argument
is.  Does one TCP connection back off on a lost packet while the other
one gets to keep going?  Hmmm.

> Josh Williams wrote:
> > That's one of the reasons I believe this
> > should be a module instead, because it's more or less a hack to patch
> > what the environment should be doing for wget, not vice versa.
>
> At this point, since it seems to have some demand, I'll probably put it
> in for 1.12.x; but I may very well move it to a module when we have
> support for that.

Thanks, yes that makes sense.

> Of course, Tony G indicated that he would prefer it to be
> conditionally-compiled, for concerns that the plugin architecture will
> add overhead to the wget binary. Wget is such a lightweight app, though,
> I'm not thinking that the plugin architecture is going to be very
> significant. It would be interesting to see if we can add support for
> some modules to be linked in directly, rather than dynamically; however,
> it'd still probably have to use the same mechanisms as the normal
> modules in order to work. Anyway, I'm sure we'll think about those
> things more when the time comes.

Makes sense.

> Or you could be proactive and start work on
> http://wget.addictivecode.org/FeatureSpecifications/Plugins
> (non-existent, but already linked to from FeatureSpecifications). :)

I'll look into that.

> On 10/14/07, Hrvoje Niksic <[EMAIL PROTECTED]> wrote:
> > "Tony Godshall" <[EMAIL PROTECTED]> writes:
> >
> > > OK, so let's go back to basics for a moment.
> > >
> > > wget's default behavior is to use all available bandwidth.
> >
> > And so is the default behavior of curl, Firefox, Opera, and so on.
> > The expected behavior of a program that receives data over a TCP
> > stream is to consume data as fast as it arrives.

What was your point exactly?  All the other kids do it?

Tony G


Re: ... --limit-rate nn%

2007-10-16 Thread Tony Godshall
On 10/15/07, Matthias Vill <[EMAIL PROTECTED]> wrote:
> Micah Cowan schrieb:
> > Matthias Vill wrote:
> >> I would appreciate having a --limit-rate N% option.
> >
> >> So now about those "broken" cases. You could do some "least of both"
> >> policy (which would of course still need the time to do measuring and
> >> can cut only afterwards).
> >> Or otherwise you could use a non-percent value as a minimum. This would
> >> be especially useful if you add it to your default options and stumble
> >> over some slow server only serving you 5KiB/s, where you most probably
> >> don't want to further lower the speed on your side.
> >
> >> As a third approach, you could honor only the last limiting option given.
> >
> >> Depending on how difficult the implementation is I would vote for the
> >> second behavior, although the first or third option might be more
> >> intuitive to some of the users not reading the docs.
> >
> > Third option should be more intuitive to the implementer, too. I vote
> > for that, as I really want to avoid putting too much sophistication into
> > this.
>
> I would expect that you need two variables for holding the percent and
> fixed values anyway, so I was just wondering whether you could do it the
> second way I suggested.
> IMHO that should be quite easy to do with a single
> if(fixed&&percent)limit=max(a,b) and thus would not even result in any
> overhead during the actual download.
>
> Greetings
>
> Matthias
>
> P.S.: I'm subscribed via news://sunsite.dk, you don't need to CC me.
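
For illustration, Matthias's second option (treat the fixed value as a
floor under the percentage) might look roughly like the sketch below. The
function name and its parameters are hypothetical and are not taken from
Wget's source; it also assumes the achievable rate has already been
measured, which is the part the thread identifies as needing time.

```c
#include <assert.h>

/* Hypothetical helper, not part of Wget.  Returns the effective cap in
   bytes per second; 0 means "no limit".
   measured_bps: throughput observed so far (assumed already measured)
   fixed_bps:    fixed --limit-rate value, 0 if not given
   percent:      --limit-rate nn% value, 0 if not given */
static double
effective_limit (double measured_bps, double fixed_bps, double percent)
{
  assert (measured_bps >= 0 && fixed_bps >= 0 && percent >= 0);

  double pct_limit = measured_bps * percent / 100.0;

  /* Both given: the fixed value acts as a minimum, so a slow server
     is not throttled even further. */
  if (fixed_bps > 0 && percent > 0)
    return pct_limit > fixed_bps ? pct_limit : fixed_bps;

  if (percent > 0)
    return pct_limit;

  return fixed_bps;
}
```

With a server delivering only 5 KiB/s, a 50% setting and a 10 KiB/s
floor, the floor wins and the download is not slowed down at all.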


Thanks for the input, guys.

I'll see what I can do.

About the parser... I'm thinking I can hack the parser that now
handles the K, M, etc. suffixes so that it works as it did before but
also accepts a '%' suffix as valid; that would reduce the amount of code
necessary to implement --limit-rate nn%.  Any reason not to do so?
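
A minimal sketch of that idea, assuming a standalone parser rather than
Wget's actual one: the struct and function names below are made up for
illustration, and only the k/K, m/M and '%' suffixes are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type; not Wget's real data structure. */
struct rate_limit
{
  double value;     /* bytes per second, or a percentage */
  bool is_percent;  /* true when the argument ended in '%' */
};

/* Parses "20k", "1.5M", "4096" or "95%".  Returns false on malformed
   input and leaves *out untouched in that case. */
static bool
parse_rate (const char *s, struct rate_limit *out)
{
  assert (s != NULL && out != NULL);

  char *end;
  double n = strtod (s, &end);
  bool is_percent = false;

  if (end == s || !(n >= 0))   /* no digits, negative, or NaN */
    return false;

  switch (*end)
    {
    case '\0':
      break;
    case 'k': case 'K':
      n *= 1024; end++;
      break;
    case 'm': case 'M':
      n *= 1024 * 1024; end++;
      break;
    case '%':
      is_percent = true; end++;
      break;
    default:
      return false;
    }

  if (*end != '\0')            /* trailing junk after the suffix */
    return false;

  out->value = n;
  out->is_percent = is_percent;
  return true;
}
```

The existing suffixes keep their meaning; the only new state is the
flag that tells the caller to interpret the number as a percentage.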


-- 
Best Regards.
Please keep in touch.


Re: version.c take two

2007-10-16 Thread Micah Cowan

Micah Cowan wrote:
> Gisle and Chris, you should be able to write this rule in your
> Makefiles. Something like:
> 
> version.c: $(SOURCES)
>   echo 'const char *version_string = "@VERSION@"' > $@
>   -hg log -r tip --template='" ({node|short})"\n' >> $@
>   echo ';' >> $@

Actually, the "hg" line ought to be:

-hg log -r . --template='" ({node|short})"\n' >> $@

As the user may not necessarily be building from tip. If closing stderr
or directing it to the equivalent of /dev/null ("2>NUL" on Windows) is
possible, that should also be included.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: version.c take two

2007-10-16 Thread Micah Cowan
Hrvoje Niksic wrote:
> Micah Cowan <[EMAIL PROTECTED]> writes:
> 
>> Alright; I'll make an extra effort to avoid non-portable Make
>> assumptions then. It's just... portable Make _sucks_ (not that
>> non-portable Make doesn't).
> 
> It might be fine to require GNU make if there is a good reason for it
> -- many projects do.  But requiring random bits and pieces of the "GNU
> toolchain", such as one or more of GNU Bash, GNU grep, GNU tar, or,
> well, printf :-), in most cases simply causes annoyance for very
> little added value.  Junior developers, or those only exposed to
> Linux, frequently simply assume that everyone has access to the tools
> they use on their development system, and fail to document that
> assumption.  I'm sure we can do better than that.

Oh, I quite agree. Sorry, I should have been more clear. (And I still
don't think "printf" should qualify as "part of the GNU toolchain" ;)
...it's been part of POSIX for a good long time.) I was mostly talking
about GNU Make, I think, and little else.

Basically, if it's not POSIX, I doubt I'll use it, and I'll tend to not
use it beyond how POSIX says it should work, unless I _know_ that the
extension I'm using is portable anyway. And even POSIX isn't perfect:
many systems fail to conform to it in various ways. I was recently
surprised to find that the awk Ubuntu ships with by default (mawk) does
not support POSIX character classes ([[:space:]] etc.), and had to modify
the fun little script I use to colorize and include the shell's joblist in
the prompt (http://micah.cowan.name/hg/promptjobs/).

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: [Patch] Plug some memleaks

2007-10-16 Thread Hrvoje Niksic
Micah Cowan <[EMAIL PROTECTED]> writes:

>> Note that, technically, those are not leaks in real need of
>> plugging because they get called only once, i.e. they do not
>> accumulate ("leak") unused memory.  Of course, it's still a good
>> idea to remove them, if nothing else, then to remove false
>> positives from DEBUG_MALLOC builds.
>
> I love that valgrind distinguishes these from "unreachable" unfreed
> memory.

By the way, now that valgrind exists, it may be time to consider
retiring DEBUG_MALLOC, a venerable hack from pre-valgrind days.  I'm
kind of surprised that anyone even uses it.  :-)


Re: version.c take two

2007-10-16 Thread Hrvoje Niksic
Micah Cowan <[EMAIL PROTECTED]> writes:

> Alright; I'll make an extra effort to avoid non-portable Make
> assumptions then. It's just... portable Make _sucks_ (not that
> non-portable Make doesn't).

It might be fine to require GNU make if there is a good reason for it
-- many projects do.  But requiring random bits and pieces of the "GNU
toolchain", such as one or more of GNU Bash, GNU grep, GNU tar, or,
well, printf :-), in most cases simply causes annoyance for very
little added value.  Junior developers, or those only exposed to
Linux, frequently simply assume that everyone has access to the tools
they use on their development system, and fail to document that
assumption.  I'm sure we can do better than that.


Re: [Patch] Plug some memleaks

2007-10-16 Thread Micah Cowan
Hrvoje Niksic wrote:
> Gisle Vanem <[EMAIL PROTECTED]> writes:
> 
>> Building with 'DEBUG_MALLOC' reveals some memory leaks that
>> should be plugged IMHO. Recursive d/l has lots more leaks which I didn't
>> address with the following patch (yet).
> 
> Note that, technically, those are not leaks in real need of plugging
> because they get called only once, i.e. they do not accumulate
> ("leak") unused memory.  Of course, it's still a good idea to remove
> them, if nothing else, then to remove false positives from
> DEBUG_MALLOC builds.

I love that valgrind distinguishes these from "unreachable" unfreed memory.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: version.c take two

2007-10-16 Thread Micah Cowan
Hrvoje Niksic wrote:
> Micah Cowan <[EMAIL PROTECTED]> writes:
> 
>> I may take liberties with the Make environment, and assume the
>> presence of a GNU toolset, though I'll try to avoid that where it's
>> possible.
> 
> Requiring the GNU toolset puts a large burden on the users of non-GNU
> systems (both free and non-free ones).  Please remember that for many
> Unix users and sysadmins Wget is one of the core utilities, to be
> compiled very soon after a system is set up.  Each added build
> dependency makes Wget that much harder to compile on a barebones
> system.

Alright; I'll make an extra effort to avoid non-portable Make
assumptions then. It's just... portable Make _sucks_ (not that
non-portable Make doesn't).

>> In cases like this, printf is much more portable (in behavior) than
>> echo, but not as dependable (on fairly old systems) for presence;
>> however, it's not a difficult tool to obtain, and I wouldn't mind
>> making it a prerequisite for Wget (on Unix systems, at any rate). In
>> a pinch, one could write an included tool (such as an "echo" command
>> that does precisely what we expect) to help with building. But
>> basically, if it's been in POSIX a good while, I'll probably expect
>> it to be available for the Unix build.
> 
> Such well-intended reasoning tends to result in a bunch of reports
> about command/feature X not being present on the reporter's system, or
> about a bogus version that doesn't work being picked up, etc.  But
> maybe the times have changed -- we'll see.

Given that there is no portable way to suppress the trailing newline with
echo, or to depend on the results of including a backslash in its argument,
printf may be hard to avoid, depending on what we need it for (with my last
revision of the Make rule, however, I've avoided it for this specific
purpose). Any modern Unix had pretty dang well better include it. Non-modern
Unixen won't generally need to bootstrap, I'd think (and probably
already include older versions of wget anyway).

Thanks for the input, Hrvoje.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: version.c take two

2007-10-16 Thread Micah Cowan
Micah Cowan wrote:
> I've improved the generation of version.c, removing the intermediate
> generation of an "hg-id" file and using a more portable replacement for
> "hg id | cut -d ' ' -f 1" (can be used on Windows and MS-DOS).
> 
> The relevant lines in src/Makefile.am are now:
> 
> version.c:  $(wget_SOURCES) $(LDADD)
>   printf '%s' 'const char *version_string = "@VERSION@' > $@
>   -hg log -r tip --template=' ({node|short})' >> $@
>   printf '%s\n' '";' >> $@
> 
> (The printf commands aren't portable to Windows AFAIK, but this should
> be easier to adapt, at any rate. Note that "echo -n" is not a portable
> method for suppressing newlines in echo's output.)

Gisle and Chris, you should be able to write this rule in your
Makefiles. Something like:

version.c: $(SOURCES)
	echo 'const char *version_string = "@VERSION@"' > $@
	-hg log -r tip --template='" ({node|short})"\n' >> $@
	echo ';' >> $@

This particular usage should actually be portable across all varying
implementations of Unix echo, as well (so maybe I'll use it too--though
printf is still probably a reasonable expectation, and I may well
require it in the future). It takes advantage of C's string-literal
concatenation. The results will be an eyesore, but will work.
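
To make the concatenation concrete, here is roughly what the generated
version.c would contain. The version number "1.11" and the changeset
hash "0123456789ab" are made-up stand-ins for whatever @VERSION@ and
"hg log" would produce, and the helper function is hypothetical, added
only to show the string being used.

```c
#include <assert.h>
#include <string.h>

/* Output of the three recipe lines, one per source line below.  The
   compiler joins the adjacent string literals into a single string. */
const char *version_string = "1.11"
" (0123456789ab)"
;

/* Hypothetical helper: nonzero when a changeset suffix was appended,
   i.e. when the "hg log" line succeeded. */
static int
version_has_changeset (void)
{
  assert (version_string != NULL);
  return strchr (version_string, '(') != NULL;
}
```

If the hg command fails (it is prefixed with "-" so make carries on),
the middle line is simply absent and the file still compiles, with
version_string holding just the bare version number.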

Note that if version.c is in SOURCES, there is still a recursive
dependency; if this is a problem for your build system, you may want to
remove version.obj from OBJS, and add it directly to the command to link
wget.exe.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: version.c take two

2007-10-16 Thread Hrvoje Niksic
Micah Cowan <[EMAIL PROTECTED]> writes:

> I may take liberties with the Make environment, and assume the
> presence of a GNU toolset, though I'll try to avoid that where it's
> possible.

Requiring the GNU toolset puts a large burden on the users of non-GNU
systems (both free and non-free ones).  Please remember that for many
Unix users and sysadmins Wget is one of the core utilities, to be
compiled very soon after a system is set up.  Each added build
dependency makes Wget that much harder to compile on a barebones
system.

> In cases like this, printf is much more portable (in behavior) than
> echo, but not as dependable (on fairly old systems) for presence;
> however, it's not a difficult tool to obtain, and I wouldn't mind
> making it a prerequisite for Wget (on Unix systems, at any rate). In
> a pinch, one could write an included tool (such as an "echo" command
> that does precisely what we expect) to help with building. But
> basically, if it's been in POSIX a good while, I'll probably expect
> it to be available for the Unix build.

Such well-intended reasoning tends to result in a bunch of reports
about command/feature X not being present on the reporter's system, or
about a bogus version that doesn't work being picked up, etc.  But
maybe the times have changed -- we'll see.


Re: version.c take two

2007-10-16 Thread Maciej W. Rozycki
On Tue, 16 Oct 2007, Micah Cowan wrote:

> I may take liberties with the Make environment, and assume the presence
> of a GNU toolset, though I'll try to avoid that where it's possible.

 Well, the issue has been resolved one way or another with many GNU
packages, including the core ones such as GCC, which generally try to be
as portable as possible.  It is always worth investigating what
others have done and copying the good examples.  Please also note the
autoconf manual has chapters on shell and make portability issues which
provide examples of how to do many things in a reliable way.

> In cases like this, printf is much more portable (in behavior) than
> echo, but not as dependable (on fairly old systems) for presence;
> however, it's not a difficult tool to obtain, and I wouldn't mind making
> it a prerequisite for Wget (on Unix systems, at any rate). In a pinch,
> one could write an included tool (such as an "echo" command that does
> precisely what we expect) to help with building. But basically, if it's
> been in POSIX a good while, I'll probably expect it to be available for
> the Unix build.

 You had better avoid the hassle of compiling any code to be run on the
build system, by any means, unless it is absolutely necessary; otherwise
you risk falling into a hole of horrible traps and riddles once you start
cross-compiling.

  Maciej


Re: version.c take two

2007-10-16 Thread Micah Cowan

Hrvoje Niksic wrote:
> Micah Cowan <[EMAIL PROTECTED]> writes:
> 
>> version.c:  $(wget_SOURCES) $(LDADD)
>>   printf '%s' 'const char *version_string = "@VERSION@' > $@
>>   -hg log -r tip --template=' ({node|short})' >> $@
>>   printf '%s\n' '";' >> $@
> 
> "printf" is not portable to older systems, but that may not be a
> problem anymore.  What are the current goals regarding portability?

GNU, modern Unixen, and Windows systems (in that order) take priority,
but portability to other systems is desirable if it's not out of the way.

I may take liberties with the Make environment, and assume the presence
of a GNU toolset, though I'll try to avoid that where it's possible.

In cases like this, printf is much more portable (in behavior) than
echo, but not as dependable (on fairly old systems) for presence;
however, it's not a difficult tool to obtain, and I wouldn't mind making
it a prerequisite for Wget (on Unix systems, at any rate). In a pinch,
one could write an included tool (such as an "echo" command that does
precisely what we expect) to help with building. But basically, if it's
been in POSIX a good while, I'll probably expect it to be available for
the Unix build.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: version.c take two

2007-10-16 Thread Hrvoje Niksic
Micah Cowan <[EMAIL PROTECTED]> writes:

> version.c:  $(wget_SOURCES) $(LDADD)
>   printf '%s' 'const char *version_string = "@VERSION@' > $@
>   -hg log -r tip --template=' ({node|short})' >> $@
>   printf '%s\n' '";' >> $@

"printf" is not portable to older systems, but that may not be a
problem anymore.  What are the current goals regarding portability?