> If there are more important and/or interesting things for me to work on,
> I'll be glad to. I'm completely unfamiliar with the current state of the
> project as a whole - and looking through JIRA is a bit daunting. The only
> reason I'm attracted to working on the fetcher is I think it's a really
> interesting and compelling problem to solve, and making it more
> flexible is something that would directly benefit our use for it, so it
> will be easier to devote time to it while I'm at the office. I do have a
> glut of free time at the moment though, so I'm perfectly okay working on
> another area that's more pressing - I just don't know what it is. I saw
> that protocol-httpclient needs to be rewritten, is there someone working on
> that?
>
Not that I am aware of.

> I can work on more important and less controversial / radical things, but
> I do think that having a more flexible, pluggable fetcher will be an
> enormous improvement to Nutch and can greatly expand the potential uses for
> it as a piece of software. There's a ton of cases where pluggable fetching
> could be a huge improvement: local filesystem search, single-threaded /
> small site indexing, email indexing (SMTP, POP, etc.), etc.
>

Isn't this already done at the protocol level?

> I suggested an extremely (perhaps too much so) abstract architecture for
> fetching in ticket #1201, and for the sake of brevity I won't repeat myself
> here, but I think that would give Nutch a good base for flexible fetching,
> which I believe is a huge improvement to the project. I'm obviously new to
> the development here and I'm willing to do whatever needs doing, I just
> believe the fetching is something that needs doing. I just want to
> contribute!
>

You are of course free to work on anything you want, and your contribution would be more than welcome. I just reacted to Lewis' comments because I did not want people to have the impression that the Fetcher was broken. I also see more urgent and useful things to do, but that's just my personal view.

Thanks

Julien

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

