RE: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-02 Thread Tony Lewis
Micah Cowan wrote:

 Keeping a single Wget and using runtime libraries (which we were terming
 plugins) was actually the original concept (there's mention of this in
 the first post of this thread, actually); the issue is that there are
 core bits of functionality (such as the multi-stream support) that are
 too intrinsic to separate into loadable modules, and that, to be done
 properly (and with a minimum of maintenance commitment) would also
 depend on other libraries (that is, doing asynchronous I/O wouldn't
 technically require the use of other libraries, but it can be a lot of
 work to do efficiently and portably across OSes, and there are already
 Free libraries to do that for us).

Perhaps both versions can include multi-threaded support in their core version, 
but the lite version would never invoke multi-threading.

Tony



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-02 Thread Micah Cowan

Tony Lewis wrote:
 Micah Cowan wrote:
 
 Keeping a single Wget and using runtime libraries (which we were
 terming plugins) was actually the original concept (there's
 mention of this in the first post of this thread, actually); the
 issue is that there are core bits of functionality (such as the
 multi-stream support) that are too intrinsic to separate into
 loadable modules, and that, to be done properly (and with a minimum
 of maintenance commitment) would also depend on other libraries
 (that is, doing asynchronous I/O wouldn't technically require the
 use of other libraries, but it can be a lot of work to do
 efficiently and portably across OSes, and there are already Free
 libraries to do that for us).
 
 Perhaps both versions can include multi-threaded support in their
 core version, but the lite version would never invoke
 multi-threading.

I mentioned this in the first post as well. The main problem I offered
for this was that async I/O tends to make for much more
complicated/hard-to-follow code, which will make the lite Wget (even
more) difficult to read, without reaping the actual benefits gained from
such complications. Of course, whether this is a sufficient
justification to maintain two different versions of Wget is another
question...

There's also the fact that libcurl starts looking _very_ attractive to
handle the async I/O web comm stuff, so that ideally we don't actually
have to rewrite any of the I/O and HTTP logic, but just replace it
wholesale. If we decide to use that for the async stuff, then it seems
to me that having two separate programs suddenly becomes more-or-less a
foregone conclusion, as I don't really want to introduce a dependency on
libcurl for the lite Wget (though Hrvoje's response on the thread that
Daniel Stenberg posted suggests I'd have an excuse to do so).
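
Just to give a flavor of what I mean: a rough, untested sketch of
driving two downloads at once off libcurl's "multi" interface, from a
single thread (the URLs are placeholders, of course):

#include <curl/curl.h>

int main(void)
{
  CURLM *multi;
  CURL *a, *b;
  int running;

  curl_global_init(CURL_GLOBAL_ALL);
  multi = curl_multi_init();

  a = curl_easy_init();
  curl_easy_setopt(a, CURLOPT_URL, "http://example.com/one");
  b = curl_easy_init();
  curl_easy_setopt(b, CURLOPT_URL, "http://example.com/two");

  curl_multi_add_handle(multi, a);
  curl_multi_add_handle(multi, b);

  /* Both transfers progress from this single loop; a real client
     would select() on the fds from curl_multi_fdset() rather than
     spin like this. */
  do {
    while (curl_multi_perform(multi, &running)
           == CURLM_CALL_MULTI_PERFORM)
      ;
  } while (running > 0);

  curl_multi_remove_handle(multi, a);
  curl_multi_remove_handle(multi, b);
  curl_easy_cleanup(a);
  curl_easy_cleanup(b);
  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}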

Note that in any case, having two separate command-line interfaces is
pretty much unavoidable IMO, as the current CLI is fast becoming
unwieldy, and certain aspects are fairly confusing, so that I don't
really want to use it as the basis on which to build some of the newer
configuration features; at the same time, I want to keep the current
interface around for the current Wget usage, so I don't break people's
scripts, etc.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-02 Thread Micah Cowan

Micah Cowan wrote:
 Tony Lewis wrote:
 Perhaps both versions can include multi-threaded support in their
 core version, but the lite version would never invoke
 multi-threading.
 
 I mentioned this in the first post as well. The main problem I offered
 for this was that async I/O tends to make for much more

I should point out, too, that I'm talking about asynchronous I/O
support, and not multithreaded support, as I'm not really keen on
introducing threads to Wget. Especially since, AFAICT, threads sort of
suck on Linux, which happens to be the kernel I actively use. This may
be somewhat unfortunate, as multithreading code tends not to introduce
the code complexity that async I/O does (though IMO it introduces
complexities of a different sort).
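
To illustrate the kind of complexity I mean: under select()-style async
I/O, even a trivial "fetch over several connections" loop turns into an
explicit state machine. A sketch (nothing Wget-specific, and the state
names are made up):

#include <stddef.h>
#include <sys/select.h>

enum conn_state { SENDING_REQUEST, READING_HEADERS, READING_BODY, DONE };

struct conn {
  int fd;                 /* the socket */
  enum conn_state state;  /* where this connection left off */
  size_t sent, received;  /* partial-progress bookkeeping */
};

/* Each pass resumes whichever connections are ready, then goes back
   to select(); nothing ever simply "reads until done" the way the
   current blocking code does. */
void drive(struct conn *conns, int n)
{
  for (;;) {
    fd_set rfds, wfds;
    int i, maxfd = -1, active = 0;

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    for (i = 0; i < n; i++) {
      if (conns[i].state == DONE)
        continue;
      active++;
      if (conns[i].state == SENDING_REQUEST)
        FD_SET(conns[i].fd, &wfds);
      else
        FD_SET(conns[i].fd, &rfds);
      if (conns[i].fd > maxfd)
        maxfd = conns[i].fd;
    }
    if (!active)
      break;
    select(maxfd + 1, &rfds, &wfds, NULL, NULL);
    /* ...then dispatch to per-state handlers that read or write
       what they can without blocking, advance conns[i].state, and
       fall back into the loop... */
  }
}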

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-02 Thread L Walsh



Micah Cowan wrote:

I'm not sure what you mean about the linux thing; there are many
instances of runtime loadable modules on Linux. dlopen() and friends are
the standard way of doing this on any Unix kernel flavor.


I _thought_ so, but when I asked a distro why they didn't
use this, they said it would require rewriting nearly all currently
existing applications.

My specific complaint was against a SuSE distro: in
order to load one.rpm, it depended on two.rpm, which depended on
three.rpm, and that on four.rpm, etc. The functionality in two.rpm
was to load a library to handle active directories which, in my
non-MS, small setup, I didn't need -- and I didn't want to load
the 5-7 supporting packages for AD, since I didn't use them.
BUT, because of static run-time loading, one.rpm would fail if two.rpm
wasn't loaded...and so on and so forth.  AFAIK, the same problem
exists on nearly every distro -- because no one bothers to think
that they might not want to load every package on the CD, just
to support local host lookup using...say nscd.  G.



Keeping a single Wget and using runtime libraries (which we were terming
plugins) was actually the original concept (there's mention of this in
the first post of this thread, actually); 

---
Sounds good to me! :-)


the issue is that there are
core bits of functionality (such as the multi-stream support) that are
too intrinsic to separate into loadable modules, and that, to be done
properly (and with a minimum of maintenance commitment) would also
depend on other libraries (that is, doing asynchronous I/O wouldn't
technically require the use of other libraries, but it can be a lot of
work to do efficiently and portably across OSes, and there are already
Free libraries to do that for us).

-
And perhaps that is the problem.  In order to re-use existing
parts of code, rather than adapting them to a load-if-necessary type
structure -- everyone prefers to just use them as is, thus one lib
references another, and another...and so on.  Like I think you pull
in cat, and you get all of the gnu-language libs and tools, which
pulls in alternate character set support, which requires certain
font rendering packages -- and of course, if you are displaying
alternate characters, let's not forget the corresponding foreign
input methods, and the Asian-char specific terminal emulators...etc.

Can I jump off a cliff yet?...ARG!  I hack around such problems,
at times, by extracting the 1 run-time library I need, and not the
rest of the package, but then my rpm-verify checks turn up supposed
errors because I'm missing package dependencies.  Sigh...

If one wanted to add multi-stream support, couldn't the
small wget check whether the multi-stream support lib is
present, and if it isn't, set max-streams equal to one -- which
might yield the basic behavior one might want for the small wget?

Not pushing a particular solution -- I, like you, am just throwing
out ideas to consider...if they've already covered the points I've
raised, feel free to just ignore my ramblings and carry on...:-)
Linda


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-02 Thread Micah Cowan

L Walsh wrote:
 Micah Cowan wrote:
 I'm not sure what you mean about the linux thing; there are many
 instances of runtime loadable modules on Linux. dlopen() and friends are
 the standard way of doing this on any Unix kernel flavor.
 
 I _thought_ so, but when I asked a distro why they didn't
 use this, they said it would require rewriting nearly all currently
 existing applications.
 
 My specific complaint was against a SuSE distro: in
 order to load one.rpm, it depended on two.rpm, which depended on
 three.rpm, and that on four.rpm, etc. The functionality in two.rpm
 was to load a library to handle active directories which, in my
 non-MS, small setup, I didn't need -- and I didn't want to load
 the 5-7 supporting packages for AD, since I didn't use them.
 BUT, because of static run-time loading, one.rpm would fail if two.rpm
 wasn't loaded...and so on and so forth.  AFAIK, the same problem
 exists on nearly every distro -- because no one bothers to think
 that they might not want to load every package on the CD, just
 to support local host lookup using...say nscd.  G.

Ah, well, that's a different situation. When you want to decide at
runtime whether to load a library, dlopen() is the standard way to
handle that. However, if the application wasn't designed to make the
decision at runtime, but rather at build time, then it does require
code rewriting.

In this case, though, we're specifically talking about loadable modules.
We might choose to allow some of them to be linked at build time, but
we'd definitely have to at least support conditional linking at runtime.
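
Something along these lines, say (just a sketch; the plugin file name
and entry point are made up):

#include <dlfcn.h>

typedef int (*init_fn) (void);

/* Returns nonzero if the (hypothetical) multi-stream plugin was found
   and initialized; zero means we run with the feature disabled. */
int
load_multistream_plugin (void)
{
  void *handle = dlopen ("libwget-multistream.so", RTLD_NOW);
  init_fn init;

  if (!handle)
    return 0;                   /* plugin not installed */

  init = (init_fn) dlsym (handle, "multistream_init");
  if (!init || init () != 0)
    {
      dlclose (handle);
      return 0;
    }
  return 1;
}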

 Keeping a single Wget and using runtime libraries (which we were terming
 plugins) was actually the original concept (there's mention of this in
 the first post of this thread, actually); 
 ---
 Sounds good to me! :-)
 
 the issue is that there are
 core bits of functionality (such as the multi-stream support) that are
 too intrinsic to separate into loadable modules, and that, to be done
 properly (and with a minimum of maintenance commitment) would also
 depend on other libraries (that is, doing asynchronous I/O wouldn't
 technically require the use of other libraries, but it can be a lot of
 work to do efficiently and portably across OSes, and there are already
 Free libraries to do that for us).
 -
 And perhaps that is the problem.  In order to re-use existing
 parts of code, rather than adapting them to a load-if-necessary type
 structure -- everyone prefers to just use them as is, thus one lib
 references another, and another...and so on.  Like I think you pull
 in cat, and you get all of the gnu-language libs and tools, which
 pulls in alternate character set support, which requires certain
 font rendering packages -- and of course, if you are displaying
 alternate characters, let's not forget the corresponding foreign
 input methods, and the Asian-char specific terminal emulators...etc.

That's retarded. Native Language Support for a terminal program
shouldn't pull in font-rendering packages: displaying the characters
properly is the terminal's responsibility. I have some trouble believing
that any packagers would actually have such dependencies, but if they
do, it's retarded. A program like cat should depend only on the system
library, and (if NLS is supported) gettext (which shouldn't depend on
anything else).

 Can I jump off a cliff yet?...ARG!  I hack around such problems,
 at times, by extracting the 1 run-time library I need, and not the
 rest of the package, but then my rpm-verify checks turn up supposed
 errors because I'm missing package dependencies.  Sigh...

Frustrating experiences with RedHat's package management are why I'm now
a Debian/Ubuntu user. :)

 If one wanted to add multi-stream support, couldn't the
 small wget check whether the multi-stream support lib is
 present, and if it isn't, set max-streams equal to one -- which
 might yield the basic behavior one might want for the small wget?

Well, but actual support for any sort of multi-stream operation means
a major rewrite of the entire I/O code. Much better to use a separate
library for that, if we can get it. In that case, it stops being
something we can simply check for and use if it's available, and
becomes something that the code would absolutely require.

 Not pushing a particular solution -- I, like you, am just throwing
 out ideas to consider...if they've already covered the points I've
 raised, feel free to just ignore my ramblings and carry on...:-)

Well, and fortunately we've got plenty of time to talk about these
things: my focus right now is on getting 1.11 out the door, after which
there are _plenty_ of things to keep me busy for 1.12 (still a lite
release) for quite some time.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-01 Thread Tony Godshall
On 10/31/07, Micah Cowan [EMAIL PROTECTED] wrote:

 Tony Godshall wrote:
  On 10/30/07, Micah Cowan [EMAIL PROTECTED] wrote:
 
  Tony Godshall wrote:
  Perhaps the little wget could be called wg.  A quick google and
  wikipedia search shows no real namespace collisions.
  To reduce confusion/upgrade problems, I would think we would want to
  ensure that the traditional/little Wget keeps the current name, and
  any snazzified version gets a new one.
 
  Please not another -ng.  How about wget2 (since we're on 1.x).  And
  the current one remains in 1.x.

 I agree that -ng would not be appropriate. But since we're really
 talking about two separate beasts, I'd prefer not to limit what we can
 do with Wget (original)'s versioning. Who's to say a 2.0 release of the
 light version will not be warranted someday?

 At any rate, the snazzy one looks to be diverging from classic Wget in
 some rather significant ways, in which case, I'd kind of prefer to part
 names a bit more severely than just wget-ng or wget2. Reget,
 perhaps: that name could be both Recursive Get (describing what's
 still its primary feature), or Revised/Re-envisioned Wget. :)

 I think, too, that names such as wget2 are more often things that
 packagers (say, Debian) do, when they want to include
 backwards-incompatible, significantly new versions of software, but
 don't want to break people's usage of older stuff. Or, when they just
 want to offer both versions. Cf apache2 in Debian.

  And then eventually everyone's gotten used to and can't live
  without the new bittorrent-like almost-multithreaded features. ;-)

 :)

Pget.

Parallel get.

Tget.

Torrent-like-get.

Bget.

Bigger get.

BBWget.

Bigger Better wget.

OK, ok sorry.


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-01 Thread Micah Cowan

L Walsh wrote:
 Honest -- I hadn't read all the threads before my post...
 
 Great ideas Micah! :-)
 
 On the idea of 2 wgets -- there is a clever way to get
 by with 1.  Put the optional functionality into separate
 run-time loadable files.  SGI's Unix (and MS Windows) do this.
 The small wget then checks to see which libraries are
 accessible -- those that aren't simply mean the features
 for those libs are disabled.  In a way, it's like how
 'vim' can optionally load perllib or python-lib at runtime
 (at least under windows) if they are present.  If they are
 not present, those features are disabled.  Too bad Linux
 didn't take this route with its libraries (have asked,
 it is possible, but there's no framework for it, and
 that might need work as well).

I'm not sure what you mean about the linux thing; there are many
instances of runtime loadable modules on Linux. dlopen() and friends are
the standard way of doing this on any Unix kernel flavor.

Keeping a single Wget and using runtime libraries (which we were terming
plugins) was actually the original concept (there's mention of this in
the first post of this thread, actually); the issue is that there are
core bits of functionality (such as the multi-stream support) that are
too intrinsic to separate into loadable modules, and that, to be done
properly (and with a minimum of maintenance commitment) would also
depend on other libraries (that is, doing asynchronous I/O wouldn't
technically require the use of other libraries, but it can be a lot of
work to do efficiently and portably across OSes, and there are already
Free libraries to do that for us).

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-11-01 Thread L Walsh

Honest -- I hadn't read all the threads before my post...

Great ideas Micah! :-)

On the idea of 2 wgets -- there is a clever way to get
by with 1.  Put the optional functionality into separate
run-time loadable files.  SGI's Unix (and MS Windows) do this.
The small wget then checks to see which libraries are
accessible -- those that aren't simply mean the features
for those libs are disabled.  In a way, it's like how
'vim' can optionally load perllib or python-lib at runtime
(at least under windows) if they are present.  If they are
not present, those features are disabled.  Too bad Linux
didn't take this route with its libraries (have asked,
it is possible, but there's no framework for it, and
that might need work as well).

My 2 cents,
Linda



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-31 Thread Tony Godshall
On 10/30/07, Micah Cowan [EMAIL PROTECTED] wrote:

 Tony Godshall wrote:
  Perhaps the little wget could be called wg.  A quick google and
  wikipedia search shows no real namespace collisions.

 To reduce confusion/upgrade problems, I would think we would want to
 ensure that the traditional/little Wget keeps the current name, and
 any snazzified version gets a new one.

Please not another -ng.  How about wget2 (since we're on 1.x).  And
the current one remains in 1.x.

And then eventually everyone's gotten used to and can't live
without the new bittorrent-like almost-multithreaded features. ;-)

Tony


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-30 Thread Tony Godshall
On 10/26/07, Josh Williams [EMAIL PROTECTED] wrote:
 On 10/26/07, Micah Cowan [EMAIL PROTECTED] wrote:
  And, of course, when I say there would be two Wgets, what I really
  mean by that is that the more exotic-featured one would be something
  else entirely than a Wget, and would have a separate name.

 I think the idea of having two Wgets is good. I too have been
 concerned about the resources required in creating the all-out version
 2.0. The current code for Wget is a bit mangled, but I think the basic
 concepts surrounding it are very good ones. Although the code might
 suck for those trying to read it, I think it could be very great with
 a little regular maintenance.

Perhaps the little wget could be called wg.  A quick google and
wikipedia search shows no real namespace collisions.

 There still remains the question, though, of whether version 2 will
 require a complete rewrite. Considering how fundamental these changes
 are, I don't think we would have much of a choice. You mentioned that
 they could share code for recursion, but I don't see how. IIRC, the
 code for recursion in the current version is very dependent on the
 current methods of operation. It would probably have to be rewritten
 to be shared.

 As for libcurl, I see no reason why not. Also, would these be two
 separate GNU projects? Would they be packaged in the same source code,
 like finch and pidgin?

 I do believe the next question at hand is what version 2's official
 mascot will be. I propose Lenny the tortoise ;)

Oooh- confusion with Debian testing

_  ..
 Lenny -  (_\/  \_,
 'uuuu~'



-- 
Best Regards.
Please keep in touch.


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-30 Thread Micah Cowan

Tony Godshall wrote:
 Perhaps the little wget could be called wg.  A quick google and
 wikipedia search shows no real namespace collisions.

To reduce confusion/upgrade problems, I would think we would want to
ensure that the traditional/little Wget keeps the current name, and
any snazzified version gets a new one.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/



Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-30 Thread Micah Cowan

Daniel Stenberg wrote:
 I guess I'm not the man to ask, nor to comment on this a lot, but look
 what I found:
 
   http://www.mail-archive.com/wget@sunsite.dk/msg01129.html
 
 I've always thought and I still believe that wget's power and most
 appreciated abilities are in the features it adds on top of the
 transfer, like HTML parsing, ftp list parsing and the other things you
 mentioned.

Of course, in this case, we'd be talking more about linking with libcurl
for Wget2, rather than incorporating it, so we wouldn't have to worry
about copyright disclaimers. Besides which, according to the maintainers
document, we only need to get those for files that do not include a
license statement.

 Of course, going one single unified transfer library is perhaps not the
 best thing from a software eco-system perspective, as competition tends
 to drive innovation and development, but the more users of a free
 software/open source project we get the better it will become.

Well, in the first place, ours isn't a library, so for the most part it
isn't really usable by other folks. :)

And there's still libwww from the W3C, at least (and probably others).

Besides, the great thing about the _free_ software eco-system, is that
even when there is only a single, unified library, as long as it is free
it can easily be forked to move in a new direction to meet differing
requirements. :)

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-30 Thread Micah Cowan

Josh Williams wrote:
 Although the code might
 suck for those trying to read it, I think it could be very great with
 a little regular maintenance.

Oh, I think it's probably already earned a reputation for greatness at
this point. But yeah, it needs some maintenance work. Which is, of
course, what I volunteered for in the first place :)

 There still remains the question, though, of whether version 2 will
 require a complete rewrite. Considering how fundamental these changes
 are, I don't think we would have much of a choice.

Right. The idea I... thought I had settled on, was to refactor what we
have, until it is sufficiently pliable to start adding some of the
version 2 features. If, OTOH, we're going to have two separate projects,
there's less motivation to try to slowly rework everything under the
sun; though there are obviously still sections that would benefit from
refactoring (gethttp and http_loop are currently still right in my
crosshairs).

 You mentioned that
 they could share code for recursion, but I don't see how. IIRC, the
 code for recursion in the current version is very dependent on the
 current methods of operation. It would probably have to be rewritten
 to be shared.

Yeah, the shared codebase would probably be pretty small. But the actual
logic for parsing HTML, deciding whether or not to descend, or
comparing Web timestamps to local ones should be sharable. But yes,
after a rewrite of the relevant code.
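
For instance, the timestamping decision itself is tiny once factored
out. Roughly (a hypothetical helper; it leans on GNU extensions and
skips the error handling real code would need):

#define _GNU_SOURCE             /* strptime(), timegm() */
#include <sys/stat.h>
#include <time.h>

/* Nonzero if the remote copy, per its Last-Modified header value, is
   newer than the local file (or if we can't tell). */
int
should_refetch (const char *local_file, const char *last_modified)
{
  struct stat st;
  struct tm tm = { 0 };

  if (stat (local_file, &st) != 0)
    return 1;                   /* no local copy yet */
  /* RFC 1123 date, e.g. "Sun, 06 Nov 1994 08:49:37 GMT" */
  if (!strptime (last_modified, "%a, %d %b %Y %H:%M:%S GMT", &tm))
    return 1;                   /* unparsable; play it safe */
  return timegm (&tm) > st.st_mtime;
}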

I don't think we'd have to make it happen, in particular; as we
discover common logic that can be factored, we'll just... do it.

 As for libcurl, I see no reason why not. Also, would these be two
 separate GNU projects? Would they be packaged in the same source code,
 like finch and pidgin?

Probably not packaged together. People who want the traditional Wget are
not gonna want to download the JavaScript and MetaLink support code. :\
We should keep it as tight as possible.

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/





Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-27 Thread Daniel Stenberg

On Fri, 26 Oct 2007, Micah Cowan wrote:

The obvious solution to that is to use c-ares, which does exactly that: 
handle DNS queries asynchronously. Actually, I didn't know this until just 
now, but c-ares was split off from ares to meet the needs of the curl 
developers. :)


We needed an async name resolver for libcurl, so c-ares started out that way --
but perhaps mostly because the original author didn't care much for our
improvements and bug fixes. ADNS is a known alternative, but we couldn't use
that due to license restrictions. You (wget) don't have that same problem with
it. I'm not able to compare them, though, as I've never used ADNS...
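
For reference, the basic c-ares pattern goes roughly like this (a
from-memory sketch; check ares.h for the exact callback signature in
your version):

#include <ares.h>
#include <netdb.h>
#include <stdio.h>
#include <sys/select.h>

static void lookup_done(void *arg, int status, int timeouts,
                        struct hostent *host)
{
  if (status == ARES_SUCCESS)
    printf("resolved: %s\n", host->h_name);
}

int main(void)
{
  ares_channel channel;

  if (ares_init(&channel) != ARES_SUCCESS)
    return 1;

  /* Fire off the query; it completes from the event loop below, so
     nothing ever blocks waiting on DNS. */
  ares_gethostbyname(channel, "example.com", AF_INET, lookup_done, NULL);

  for (;;) {
    fd_set readers, writers;
    struct timeval tv, *tvp;
    int nfds;

    FD_ZERO(&readers);
    FD_ZERO(&writers);
    nfds = ares_fds(channel, &readers, &writers);
    if (nfds == 0)
      break;                    /* no queries still pending */
    tvp = ares_timeout(channel, NULL, &tv);
    select(nfds, &readers, &writers, NULL, tvp);
    ares_process(channel, &readers, &writers);
  }

  ares_destroy(channel);
  return 0;
}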


Of course, if we're doing asynchronous net I/O stuff, rather than reinvent 
the wheel and try to maintain portability for new stuff, we're better off 
using a prepackaged deal, if one exists. Luckily, one does; a friend of mine 
(William Ahern) wrote a package called libevnet that handles all of that;


When I made libcurl grok a vast number of simultaneous connections, I went 
straight with libevent for my test and example code. It's solid and fairly 
easy to use... Perhaps libevnet makes it even easier, I don't know.
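
With libevent 1.x it's more or less this (again just a sketch):

#include <event.h>
#include <unistd.h>

static void on_readable(int fd, short what, void *arg)
{
  char buf[4096];
  ssize_t n = read(fd, buf, sizeof buf);  /* fd is ready; won't block */
  if (n <= 0)
    event_loopexit(NULL);                 /* EOF or error: stop the loop */
}

int main(void)
{
  struct event ev;

  event_init();
  /* Watch stdin; EV_PERSIST re-arms the event after each callback. */
  event_set(&ev, 0, EV_READ | EV_PERSIST, on_readable, NULL);
  event_add(&ev, NULL);
  return event_dispatch();
}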


Plus, there is the following thought. While I've talked about not 
reinventing the wheel, using existing packages to save us the trouble of 
having to maintain portable async code, higher-level buffered-IO and network 
comm code, etc, I've been neglecting one more package choice. There is, 
after all, already a Free Software package that goes beyond handling 
asynchronous network operations, to specifically handle asynchronous _web_ 
operations; I'm speaking, of course, of libcurl.


I guess I'm not the man to ask, nor to comment on this a lot, but look what I found:

  http://www.mail-archive.com/wget@sunsite.dk/msg01129.html

I've always thought and I still believe that wget's power and most appreciated 
abilities are in the features it adds on top of the transfer, like HTML 
parsing, ftp list parsing and the other things you mentioned.


Of course, going one single unified transfer library is perhaps not the best 
thing from a software eco-system perspective, as competition tends to drive 
innovation and development, but the more users of a free software/open source 
project we get the better it will become.


Re: Thoughts on Wget 1.x, 2.0 (*LONG!*)

2007-10-26 Thread Josh Williams
On 10/26/07, Micah Cowan [EMAIL PROTECTED] wrote:
 And, of course, when I say there would be two Wgets, what I really
 mean by that is that the more exotic-featured one would be something
 else entirely than a Wget, and would have a separate name.

I think the idea of having two Wgets is good. I too have been
concerned about the resources required in creating the all-out version
2.0. The current code for Wget is a bit mangled, but I think the basic
concepts surrounding it are very good ones. Although the code might
suck for those trying to read it, I think it could be very great with
a little regular maintenance.

There still remains the question, though, of whether version 2 will
require a complete rewrite. Considering how fundamental these changes
are, I don't think we would have much of a choice. You mentioned that
they could share code for recursion, but I don't see how. IIRC, the
code for recursion in the current version is very dependent on the
current methods of operation. It would probably have to be rewritten
to be shared.

As for libcurl, I see no reason why not. Also, would these be two
separate GNU projects? Would they be packaged in the same source code,
like finch and pidgin?

I do believe the next question at hand is what version 2's official
mascot will be. I propose Lenny the tortoise ;)

   _  ..
Lenny -  (_\/  \_,
'uuuu~'