On 08/07/2012 02:42 PM, Fernando Cassia wrote:
> On Tue, Aug 7, 2012 at 3:08 PM, Micah Cowan <[email protected]> wrote:
>> I think the maintainer is aware that Wget's code quality is poor, and
>> would welcome sweeping architectural changes; I know I would have, when
>> I was maintainer.
>
> Just an idea... why not "fork" it, call it "wget-NG" (Next Generation
> ;), and develop it in parallel. When/if the "brand new, nifty, easier
> to maintain, completely cool design" next-generation turns out to be
> as stable and a drop-in replacement for the older -and judged as such
> by the community- then the community itself will switch to 'wget-ng'
> (or 'wgetr2'), and at that point the old code base can stop being
> maintained...
That's actually what I'm basically doing right now, though I've had scant time for it just recently. http://niwt.addictivecode.org/. Many of the ideas I'm including (or will include) in Niwt were originally framed specifically as ideas for a "Wget 2.0" or what have you.

However, it takes those ideas in a different enough direction that it's not a simple "competing project X is better, so use that instead of wget" decision. Wget is monolithic, portable to non-Unix systems, written entirely in C, and can be built with few dependencies. My "Niwt" project specifically aims to be as hackable and behavior-changeable as possible, and to be modular in the traditional Unix style (a composition of many smaller utilities, each of which does "one thing well"), at the cost of resource consumption and efficiency (especially), and of being tied inextricably to Unix. Also, since it's essentially a big pipeline of many parts, more moving parts generally means more things that "can go wrong". There are definite trade-offs, and which project is better for a user depends very greatly on what their requirements are.

My next big sweep in Niwt is meant to rewrite the core engine to be significantly more efficient (Niwt basically constructs a shell pipeline and then evaluates it; the pipeline will always be shell, or at the very least shell-like, but the bit that CONSTRUCTS the pipeline doesn't have to be shell, and currently is). But efficiency will always suffer from the fact that data gets copied from one process to another (as it does in any pipeline), and that forking can happen for every HTTP request/response (depending on what options are used).

The plus side is that this sort of extremely modular architecture lets you plug in whatever functionality you want. Imagine rendering image content as it's being downloaded to disk, so you can preview what's going on. Or recursing through HTTP links found in PDF files as well as HTML files. Or transforming JPGs to PNGs on the fly, before saving. The possibilities are endless.

> And by the way, thanks for the response Micah. I don't want to know
> who's behind every email, as long as the FSF knows who it's dealing
> with. I wasn't aware that paperwork was required. Then I guess it's
> OK.
>
> I was just concerned since wget is too ubiquitous and becomes an easy
> target for nefarious sources to inject vulnerabilities into it...

Well, IMO, no project ought to be so lax in its code review policies that such a thing is possible. The best means of avoiding security vulnerabilities, IMO, is (A) to actually look at the code that comes in, to see that it is as it should be, and (B) to make sure all the code is readable enough that (A) is possible (and hopefully easy). illusionoflife's proposal is to help us with (B), which would be to some degree counterproductive if malicious intent were harbored. :)

-mjc
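
P.S. To make the "plug in whatever functionality you want" point a bit more concrete, here is a rough, purely hypothetical sketch of that kind of pipeline composition. The stage names (niwt-fetch, niwt-extract-links) are made up for illustration and are not Niwt's actual interface; the plugged-in stage is ImageMagick's "convert", doing the JPG-to-PNG conversion on the fly:

  # Hypothetical sketch only: "niwt-fetch" and "niwt-extract-links" are
  # invented names standing in for whatever the real tools end up being.
  # The point is the shape: every stage is a small filter, and new
  # behavior is added by splicing another filter into the pipeline.
  niwt-fetch 'http://example.com/gallery/' |
  niwt-extract-links --type=img |
  while read -r url; do
      # Plugged-in stage: convert each JPG to PNG before it hits disk.
      name=$(basename "$url" .jpg).png
      niwt-fetch "$url" | convert jpg:- png:- > "$name"
  done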
