Hi,

I just finished reading all the source code for the Nutch GUI.
Personally, I don't like putting a lot of code snippets into JSP files,
since it makes refactoring take a lot of time. So how about adopting
Velocity or FreeMarker with servlets instead?

On 1/17/07, Enis Soztutar <[EMAIL PROTECTED]> wrote:
Hi all, for NUTCH-251:

Judging by the votes, I suppose that NUTCH-251 is a relatively significant
issue. Stefan has written a good plugin for the admin GUI, and I have
updated it to work with nutch-0.8 and hadoop 0.4.

Some of the features in the patch are not appropriate for our use cases,
and it requires Hadoop changes, so I am currently working on an
alternative implementation of the administration GUI. It consists of a
Hadoop server (like JobTracker) that listens for submitted jobs, a web GUI
to submit and track the jobs from the browser, and a job runner.

The architecture details of the patch are as follows:

  - An abstract class, AdminJob, representing a job in Nutch.
  - Various classes extending AdminJob, e.g. FetchAdminJob, IndexAdminJob.
  - A queue which sorts the jobs in priority order, using a modified
topological sort (jobs can depend on one another).
  - An interface to submit jobs.
  - An RPC server to listen for job submissions.
  - An extension point (basically the same as the previous).
  - A web server to serve plugin JSPs.

The features will be:
    - submitting jobs from code, command line or web interface,
    - tracking jobs from the command line or web interface
    - scheduling jobs

I could send the code or details if anyone is interested in pretesting.
I would also appreciate any comments and suggestions on this. I am
planning to complete the patch and submit it to Jira ASAP.

Sami Siren wrote:
> Hello,
>
> It has been a while since the previous release (0.8.1), and looking at the
> great fixes done in trunk, I'd start thinking about baking a new release
> soon.
>
> Looking at the Jira roadmaps, there is one blocking issue (fixing the
> license headers) for 0.8.2 and two other blocking issues for 0.9.0, of
> which I think NUTCH-233 is safe to put in.
>
> The top 10 voted issues are currently:
>
> NUTCH-61      Adaptive re-fetch interval. Detecting unmodified content
> NUTCH-48      "Did you mean" query enhancement/refinement feature
> NUTCH-251     Administration GUI
> NUTCH-289     CrawlDatum should store IP address
> NUTCH-36      Chinese in Nutch
> NUTCH-185     XMLParser is configurable xml parser plugin.
> NUTCH-59      meta data support in webdb
> NUTCH-92      DistributedSearch incorrectly scores results
> NUTCH-68      A tool to generate arbitrary fetchlists
> NUTCH-87      Efficient site-specific crawling for a large number of sites
>
> Are there any opinions about issues that should go in before the next
> release (Answering yes means that you are willing to provide a patch for
> it).
>
> --
>  Sami Siren
>
>
