Re: wget default behavior

2007-10-14 Thread Hrvoje Niksic
Tony Godshall [EMAIL PROTECTED] writes:

 OK, so let's go back to basics for a moment.

 wget's default behavior is to use all available bandwidth.

And so is the default behavior of curl, Firefox, Opera, and so on.
The expected behavior of a program that receives data over a TCP
stream is to consume data as fast as it arrives.


Re: Myriad merges

2007-10-14 Thread Jochen Roderburg
Zitat von Micah Cowan [EMAIL PROTECTED]:


 Micah Cowan wrote:
  Jochen Roderburg wrote:
  Unfortunately, however, a new regression crept in:
  In the case timestamping=on, content-disposition=off, no local file
  present, it now correctly skips the HEAD, but issues two (!!) GETs and
  transfers the file twice.
 
  Ha! Okay, gotta get that one fixed...

 That should now be fixed.

 It's hard to be confident I'm not introducing more issues, with the
 state of http.c being what it is. So please beat on it! :)

This time it survived the beating  ;-)
Seems that we are finally converging. The double GET is gone, and my other test
cases still work as expected, including the -c variants.

 One issue I'm still aware of is that, if -c and -e
 contentdisposition=yes are specified for a file already fully
 downloaded, HEAD will be sent for the Content-Disposition, and yet a GET
 will still be sent to fetch the remainder of the -c (resulting in a 416
 Requested Range Not Satisfiable). Ideally, Wget should be smart enough
 to see from the HEAD that the Content-Length already matches the file's
 size, even though -c no longer requires a HEAD (again). We _got_ one, we
 should put it to good use.

 However, I'm not worried about addressing this before 1.11 releases;
 it's a minor complaint, and with content-disposition's current
 implementation, users are already going to be expecting an extra HEAD
 round-trip in the general case; what's a few extra?

Agreed. I can confirm this behaviour, too. And I would also consider this a
minor issue; at least the result is correct.
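
For concreteness, the exchange looks roughly like this (the URL and file
size are made up for illustration; the option spelling is the one from the
message above, and the local file is assumed already complete):

  $ wget -c -e contentdisposition=yes http://example.com/file.tar.gz
  HEAD /file.tar.gz                      (looking for Content-Disposition)
  GET /file.tar.gz                       (Range: bytes=1048576-)
  416 Requested Range Not Satisfiable

Since the HEAD response already carries Content-Length: 1048576, wget could
compare that against the local size and skip the GET entirely.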

I have also not made many tests where content-disposition is really used for the
filename. Those few real-life cases that I have at hand do not send any
special headers like timestamps and file lengths with it. At least the local
filename is set correctly and is correctly renamed if it already exists.

Best regards and thanks again for the repair of all the issues that I found,

Jochen Roderburg



Re: css @import parsing

2007-10-14 Thread Andreas Pettersson

Andreas Pettersson wrote:

Has there been any progress with this patch since this post?
http://www.mail-archive.com/wget@sunsite.dk/msg09502.html

*bump*

Anyone know the status of this?

--
Andreas




Re: Myriad merges

2007-10-14 Thread Micah Cowan

Jochen Roderburg wrote:
 Zitat von Micah Cowan [EMAIL PROTECTED]:

 It's hard to be confident I'm not introducing more issues, with the
 state of http.c being what it is. So please beat on it! :)
 
 This time it survived the beating  ;-)

Yay!! :D

 One issue I'm still aware of is that, if -c and -e
 contentdisposition=yes are specified for a file already fully
 downloaded, HEAD will be sent for the Content-Disposition, and yet a GET
 will still be sent to fetch the remainder of the -c (resulting in a 416
 Requested Range Not Satisfiable). Ideally, Wget should be smart enough
 to see from the HEAD that the Content-Length already matches the file's
 size, even though -c no longer requires a HEAD (again). We _got_ one, we
 should put it to good use.

 However, I'm not worried about addressing this before 1.11 releases;
 it's a minor complaint, and with content-disposition's current
 implementation, users are already going to be expecting an extra HEAD
 round-trip in the general case; what's a few extra?
 
 Agreed. I can confirm this behaviour, too. And I would also consider this a
 minor issue; at least the result is correct.
 
 I have also not made many tests where content-disposition is really used for
 the filename. Those few real-life cases that I have at hand do not send any
 special headers like timestamps and file lengths with it. At least the local
 filename is set correctly and is correctly renamed if it already exists.

And I expect there are probably several bugs lurking here (which is why
I've designated it as experimental). After the 1.11 release I want to
revisit that section, and look more closely at what happens if we get a
Content-Disposition at the last minute, especially if it specifies a
local file name that we are rejecting. I'd prefer that it not use HEAD
at all for that, as I expect Content-Disposition is rare enough that it
doesn't justify issuing HEAD just to see if it's present; and in any case
it probably frequently isn't sent with HEAD responses, but only for GET.
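
For illustration, the kind of response at issue looks something like this
(a made-up example, not from any real server):

  HTTP/1.1 200 OK
  Content-Disposition: attachment; filename="notes-2007.txt"

With content-disposition handling on, the body gets saved as notes-2007.txt
rather than under a name derived from the URL; the question is what to do
when that name is one we would reject.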

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: css @import parsing

2007-10-14 Thread Micah Cowan

Andreas Pettersson wrote:
 Andreas Pettersson wrote:
 Has there been any progress with this patch since this post?
 http://www.mail-archive.com/wget@sunsite.dk/msg09502.html
 *bump*
 
 Anyone know the status of this?

Not yet installed... don't know what else to tell you, except that it's
slated to be included in Wget 1.12. Wget 1.11 is expected to be released
quite soon (just waiting for resolution of some licensing stuff), and
I'm afraid to say that CSS support won't make it in time for that.

However, I too am very interested in seeing CSS support included in Wget;
it'll go in when we have time to look at it more closely, and it's one of
my higher priorities for Wget 1.12.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: wget default behavior [was Re: working on patch to limit to percent of bandwidth]

2007-10-14 Thread Tony Godshall
On 10/13/07, Josh Williams [EMAIL PROTECTED] wrote:
 On 10/13/07, Tony Godshall [EMAIL PROTECTED] wrote:
  Well, you may have such problems, but you are very much reaching in
  thinking that my --limit-percent has anything to do with any failing
  in Linux.
 
  It's about dealing with unfair upstream switches, which, I'm quite
  sure, were not running Linux.
 
  Let's not hijack this into a linux-bash.

 I really don't know what you were trying to say here...

You seemed to think --limit-percent was a solution for a misbehavior of Linux.

My experience with Linux networking is that it's very effective, and
that upstream non-Linux switches don't handle such an effective client
well.

When a Linux box is my gateway/firewall, I don't experience
single-client monopolization at all.

As to your Linux issues, that's a topic that should probably be
discussed in another forum, but I will say that I'm quite happy with the
latest Linux kernels: with the low-latency patch integrated and enabled,
my desktop experience is quite snappy, even on this four-year-old 1.2GHz
laptop.  And stay away from the distro server kernels; they are
optimized for throughput at the cost of latency, doing their I/O in
bigger chunks.  And stay away from the RT kernels; they go too far in
giving I/O priority over everything else and end up churning on IRQs
unless they are very carefully tuned.

And no, I won't call the Linux kernel GNU/Linux, if that was what you
were after.  The kernel is, after all, the one Linux thing in a
GNU/Linux system.

 .. I use GNU/Linux.

Anyone try Debian GNU/BSD yet?  Or Debian/Nexenta/GNU/Solaris?

-- 
Best Regards.
Please keep in touch.


Re: wget default behavior

2007-10-14 Thread Tony Godshall
On 10/14/07, Hrvoje Niksic [EMAIL PROTECTED] wrote:
 Tony Godshall [EMAIL PROTECTED] writes:

  OK, so let's go back to basics for a moment.
 
  wget's default behavior is to use all available bandwidth.

 And so is the default behavior of curl, Firefox, Opera, and so on.
 The expected behavior of a program that receives data over a TCP
 stream is to consume data as fast as it arrives.

Yup.


RE: Version tracking in Wget binaries

2007-10-14 Thread Christopher G. Lewis
OK, so I'm trying to be open-minded and deal with yet another version
control system.

I've cloned the repository and built my mainline.  version.c is not
autogenerated on Windows, so the build fails, missing
version.obj.

Note that in the Windows world, we use Nmake from the MSVC install; no
GNU tools required.

An aside on Hg...

Confirm for me that I basically need to do the following:

Create a clone repository:
  hg clone http://hg.addictivecode.org/wget/mainline

Get any changes from mainline into my clone 
  hg pull http://hg.addictivecode.org/wget/mainline

Make my src changes, create a changeset... And then I'm lost...

And as a follow-up question - what does Hg get you above and beyond CVS
or SVN?  I kind of get the non-centralized aspect of repositories and
clones, but I don't understand how changesets and tips work.  My
thoughts are that there is *one* source of the code (with histories)
regardless of SVN, Hg or whatever.  

  Hg's concept of multiple clones and repositories is quite interesting,
but doesn't feel right for the remote, non-connected group of developers
that wget gathers input from.  If we were all behind a firewall or could
share out each user's repository, it might make more sense, but I (for
one) wouldn't be able to share my repository (NAT'd, firewalled,
corporate desktop), so I just don't get it.  


Chris


Christopher G. Lewis
http://www.ChristopherLewis.com
 

 -Original Message-
 From: Micah Cowan [mailto:[EMAIL PROTECTED] 
 Sent: Saturday, October 13, 2007 4:59 AM
 To: Wget
 Subject: Re: Version tracking in Wget binaries
 
 
 Micah Cowan wrote:
  Hrvoje Niksic wrote:
  Micah Cowan [EMAIL PROTECTED] writes:
  
  Among other things, version.c is now generated rather than
  parsed. Every time make all is run, version.c is regenerated, which
  means that make all will always relink the wget binary, even if
  there haven't been any changes.
  I personally find that quite annoying.  :-(  I hope there's a very
  good reason for introducing that particular behavior.
  
  Well, making version.c a generated file is necessary to get the
  most-recent revision for the working directory. I'd like to avoid it,
  obviously, but am not sure how without making version.c dependent on
  every source file. But maybe that's the appropriate fix. It shouldn't
  be too difficult to arrange; probably just

    version.c: $(wget_SOURCES)

  or similar.
 
 version.c is no longer unconditionally generated. The secondary file,
 hg-id, which is generated to contain the revision id (and is used to
 avoid using GNU's $(shell ...) extension, which autoreconf complains
 about), depends on $(wget_SOURCES) and $(LDADD) (so that it properly
 includes conditionally-used sources such as http-ntlm.c or gen-md5.c
 when applicable).
 
 This has the advantage that every make does not result in regenerating
 version.c, recompiling version.c, and relinking wget. It has the
 potential disadvantage that, since $(wget_SOURCES) includes version.c
 itself, there is a circular dependency: version.c -> hg-id ->
 version.c. GNU Make is smart enough to catch that and throw that
 dependency out.
 
 --
 Micah J. Cowan
 Programmer, musician, typesetting enthusiast, gamer...
 http://micah.cowan.name/
 
 


Re: Version tracking in Wget binaries

2007-10-14 Thread Micah Cowan

Christopher G. Lewis wrote:
 OK, so I'm trying to be open-minded and deal with yet another version
 control system.
 
 I've cloned the repository and built my mainline.  version.c is not
 autogenerated on Windows, so the build fails, missing
 version.obj.

Right; I think I mentioned that would happen.

 Note that in the Windows world, we use Nmake from the MSVC install; no
 GNU tools required.

Right; and I don't expect that you'll be able to do it exactly as I've
done. However, the contents of src/Makefile.am should give good hints
about how it could be done in Nmake. AFAIK, the only thing Unix-specific
about the rules as I've done them, in fact, is the use of the Unix cut
command. If absolutely necessary, that part could be removed, with the
Nmake rules similar to:

hg-id: $(OBJS)
        -hg id > $@

It's just that only the first word is needed.
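
For reference, the Unix side with cut looks roughly like this (a sketch
reconstructed from this thread, not a verbatim copy of src/Makefile.am):

  hg-id: $(wget_SOURCES) $(LDADD)
          hg id | cut -d' ' -f1 > $@

cut keeps only the first word of hg id's output (the revision hash), so
trailing markers such as tag or branch names don't end up in the file.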

 An aside on Hg...
 
 Confirm for me that I basically need to do the following:
 
 Create a clone repository:
   hg clone http://hg.addictivecode.org/wget/mainline

Approximate equivalent to svn co.

 Get any changes from mainline into my clone 
   hg pull http://hg.addictivecode.org/wget/mainline

Roughly equivalent to svn up, though hg pull only fetches the
changesets; a following hg update brings them into your working files.

 Make my src changes, create a changeset... And then I'm lost...

Alright, so you can make your changes, and issue an hg diff, and
you've basically got what you used to do with svn.

Or, if they're larger changes, you can run hg ci periodically as you
change, to save progress so to speak.
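
Putting that together, a minimal session might look something like this
(the clone URL is the one from your message; the commit message and patch
filename are just examples):

  hg clone http://hg.addictivecode.org/wget/mainline
  cd mainline
  hg pull -u                       # later: fetch upstream changes and update to them
  (edit sources)
  hg ci -m 'describe the change'   # local commit; nearly instantaneous
  hg export tip > my.patch         # turn the tip changeset into a mailable patch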

 And as a follow-up question - what does Hg get you above and beyond CVS
 or SVN?  I kind of get the non-centralized aspect of repositories and
 clones, but I don't understand how changesets and tips work.

Well, changesets are in all SCMs, as far as I know. A changeset is just
the set of changes that you check in when you do svn ci or hg ci.
Every revision id corresponds to and identifies a changeset.

tip is just the Mercurial equivalent of Subversion's HEAD. In
Mercurial, the tip is always the very last revision made, whereas
heads are the last revision made to each unclosed branch in a repository.
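
You can poke at these directly; all three are stock hg commands:

  hg tip        # the newest changeset in the repository
  hg heads      # the newest changeset on each open branch
  hg log -l 5   # the five most recent changesets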

 My thoughts are that there is *one* source of the code (with histories)
 regardless of SVN, Hg or whatever.

One official one, sure.

For me, the major advantages are that I can be working on several
things, each with history, without touching the official repository. I
can work on large changes in my car while my wife drives the family
out of town, without having to worry about screwing something up in a
way I can't roll back to a good point (other than back to the last
official point in the repo, or whatever I had the foresight to cp
-r). And I can check in changes where each commit takes a fraction of a
second, and then push them all over the net when I'm ready for them to
be sent, instead of taking several seconds for each commit. Believe me,
you begin to appreciate that after a few times.

Admittedly, these advantages are mainly advantages to pretty active
developers, which, at the moment, is pretty much just me. :) I've
definitely found use of a DVCS to be absolutely awesome for my purposes.

   Hg's concept of multiple clones and repositories is quite interesting,
 but doesn't feel right for the remote, non-connected group of developers
 that wget gathers input from.  If we were all behind a firewall or could
 share out each user's repository, it might make more sense, but I (for
 one) wouldn't be able to share my repository (NAT'd, firewalled,
 corporate desktop), so I just don't get it.

Sharing is a potentially useful aspect of DVCSes, to be sure, but it's
not all they have going for them, and in fact isn't really the reason I
made the move.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Wget gnulib-ized

2007-10-14 Thread Micah Cowan

Mainline now has replaced a few of Wget's portability pieces with
corresponding gnulib modules. This has resulted in significant changes
to what needs to be built where, so non-Unix builds are probably further
broken (...sorry, Chris, Gisle... *'n'*). Various Unix builds may
possibly have been broken as well; hopefully it'll come out in testing.

The pieces replaced were, I think, old code culled from libiberty or
otherwise from the GNU collective pool: gnu-md5 (now md5), getopt,
safe-ctype (now c-ctype). stdint.h and stdbool.h detection/replacement
were pulled in automatically through importing those modules, but I
haven't altered the build setup to use those instead of our own builtin
stuff yet.
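
For the curious, the import amounts to something like this (the module
names are the ones above; the exact invocation I used may have differed):

  gnulib-tool --import md5 getopt c-ctype

gnulib-tool copies each module's sources into the tree and emits the
Autoconf/Automake glue to build them.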

So, at the moment, I've just introduced tremendous instability to
mainline with the only benefit being mildly updated equivalents to about
three files from the GNU collective. ^_^

However, I expect the payoff in the long run to be worth it, as I can
now more easily take advantage of other modules gnulib offers. I expect
that the inline module could be handy for taking advantage of build
environments that offer inlined functions, and of course getpass will be
useful (though we may need to special-case our handling of that one);
the quote (for dealing with strange characters when quoting, say,
filenames) and regex (the same engine Emacs uses, I believe; it would
serve the proposed regex support in -A, -R, and the like) modules are
also possibilities.
And, especially, there are several ADTs that I expect that I will need
shortly, in applications where string-hashes may not fill the need.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
