> Or, if you have experience with JSPs/GUI work, then I think there's this
> big open issue around improving the Nutch GUI, which would likely provide
> the most benefit to the most users. I haven't been following the current
> status, but I know that there have been periodic discussions, and I think
> 101tec did some work on this a while back (for a client), but I don't know
> if that's been contributed (or could be, for that matter).
>

A related issue is porting the REST API from nutchgora to trunk
(https://issues.apache.org/jira/browse/NUTCH-880), which in turn could be
used by a GUI.

J.



>
> -- Ken
>
> On Jan 21, 2012, at 8:17am, Edward Drapkin wrote:
>
>  On 1/21/2012 8:27 AM, Lewis John Mcgibbney wrote:
>
> Hi Julien,
>
>
>> There are 8 issues in trunk about the fetcher - some of them unrelated
>> to the Fetcher (NUTCH-827 <https://issues.apache.org/jira/browse/NUTCH-827> /
>> NUTCH-1193) with most of the others being improvements (
>> NUTCH-828 <https://issues.apache.org/jira/browse/NUTCH-828> /
>> NUTCH-1079 <https://issues.apache.org/jira/browse/NUTCH-1079>)
>> with possibly just a very few being real issues.
>
>
> This puts the whole discussion into much better context, thanks for
> pointing this out. Maybe I should have made it clearer that I only
> filtered the fetcher issues on our Jira and was simply modelling my
> discussion around that. You are completely correct though, it would be
> different if the fetcher was in a similar state to protocol-httpclient...
> which it is obviously not.
>
>
>> I am also concerned about getting too radical changes to such a core part
>> of the framework, especially when more pressing issues could be looked
>> after instead.
>
> +1
>
>
>> Having said that if someone can come up with an interesting proposal for
>> improving the Fetcher that would be very good, I would simply suggest that
>> we then have a separate implementation for that.
>>
> +1
>
>
>>
>>
> OK, with this in mind, is there some guidance we can communicate to
> Eddie? He has specifically mentioned that he shares similar opinions wrt
> the fetcher being a core part of Nutch, radical changes etc., and I also
> share this point of view. He has also added that he doesn't want to spend
> time changing material which we may or may not merge with trunk, which
> also makes perfect sense. Additionally, Ken's comments emphasise that this
> has been somewhat attempted in the past, that lessons have been learned,
> and that the implementation we have cuts the mustard as is.
> Maybe we could nudge Eddie in the right direction, which would benefit
> both him and the project over the next while. I think this was the most
> important point I was trying to emphasise; however, looking over my
> original comment, this was maybe not how it was written.
>
> Thanks
> Lewis
>
>
> If there are more important and/or interesting things for me to work on,
> I'll be glad to.  I'm completely unfamiliar with the current state of the
> project as a whole - and looking through JIRA is a bit daunting.  The only
> reason I'm attracted to working on the fetcher is that I think it's a
> really interesting and compelling problem to solve, and making it more
> flexible is something that would directly benefit our use of it, so it
> will be easier to devote time to it while I'm at the office.  I do have a
> glut of free time at the moment though, so I'm perfectly okay working on
> another area that's more pressing - I just don't know what it is.  I saw
> that protocol-httpclient needs to be rewritten; is there someone working
> on that?
>
> I can work on more important and less controversial / radical things, but
> I do think that having a more flexible, pluggable fetcher will be an
> enormous improvement to Nutch and can greatly expand the potential uses for
> it as a piece of software.  There are a ton of cases where pluggable
> fetching could be a huge improvement: local filesystem search,
> single-threaded / small site indexing, email indexing (SMTP, POP, etc.),
> etc.  I suggested an extremely (perhaps too much so) abstract architecture
> for fetching in ticket #1201, and for the sake of brevity I won't repeat
> myself here, but I think that would give Nutch a good base for flexible
> fetching, which I believe is a huge improvement to the project.  I'm
> obviously new to the development here and I'm willing to do whatever needs
> doing; I just believe the fetching is something that needs doing.  I just
> want to contribute!
>
> Thanks,
> Eddie
>
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
>
>
>
>
>


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
