[jira] [Created] (NUTCH-1286) Refactoring/reimplementing crawling API (NutchApp)

Ferdy Galema (Created) (JIRA) Mon, 20 Feb 2012 06:56:01 -0800

Refactoring/reimplementing crawling API (NutchApp)
--------------------------------------------------


                 Key: NUTCH-1286
                 URL: https://issues.apache.org/jira/browse/NUTCH-1286
             Project: Nutch
          Issue Type: Improvement
          Components: administration gui, REST_api, web gui
            Reporter: Ferdy Galema


This issue tracks the changes we (Mathijs and I) have planned for the API and 
webapp in Nutchgora. We have a pretty good idea of how we want to use the 
crawl API. It may involve some major refactoring, or perhaps a side 
implementation next to the current NutchApp functionality, depending on how 
much of the existing components we can reuse. The bottom line is that there 
will be a strictly defined Java API that provides everything from 
crawling/indexing to job control (listing jobs, tracking progress and aborting 
jobs are part of it). There will be no server or service for tracking crawl 
state; everything will be persisted one way or another and be queryable from 
the API. The REST server shall be a very thin layer on top of the Java 
implementation. A rich web interface will be an easy layer to add too, once we 
have a cleanly (but extensively) defined API. But we will start by making the 
API usable from a simple command-line interface.
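
To make the job-control part of this description a bit more concrete, here is 
a minimal Java sketch of what such an API could look like. Every name in it 
(CrawlApiSketch, JobControl, JobStatus, InMemoryJobControl) is hypothetical and 
not part of Nutch or of the planned design. The in-memory map only stands in 
for whatever persistent store would actually hold the job state, so that the 
example stays self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Nutch.
final class CrawlApiSketch {

    enum JobState { RUNNING, FINISHED, ABORTED }

    /** Immutable snapshot of a job, as it would be read back from storage. */
    static final class JobStatus {
        final String id;
        final String type;      // e.g. "fetch" or "index" (assumed labels)
        final JobState state;
        final float progress;   // 0.0 to 1.0

        JobStatus(String id, String type, JobState state, float progress) {
            this.id = id;
            this.type = type;
            this.state = state;
            this.progress = progress;
        }
    }

    /** Job control as described above: list jobs, track progress, abort. */
    interface JobControl {
        String submit(String type);
        void reportProgress(String jobId, float progress);
        List<JobStatus> listJobs();
        JobStatus status(String jobId);
        boolean abort(String jobId);
    }

    /** Stand-in for a persisted store; no tracking server is involved. */
    static final class InMemoryJobControl implements JobControl {
        private final Map<String, JobStatus> store = new LinkedHashMap<>();
        private int nextId = 1;

        @Override
        public String submit(String type) {
            String id = "job-" + nextId++;
            store.put(id, new JobStatus(id, type, JobState.RUNNING, 0f));
            return id;
        }

        @Override
        public void reportProgress(String jobId, float progress) {
            JobStatus s = store.get(jobId);
            if (s == null || s.state != JobState.RUNNING) {
                return; // unknown or already finished/aborted: ignore
            }
            JobState next = progress >= 1f ? JobState.FINISHED : JobState.RUNNING;
            store.put(jobId, new JobStatus(s.id, s.type, next, progress));
        }

        @Override
        public List<JobStatus> listJobs() {
            return new ArrayList<>(store.values());
        }

        @Override
        public JobStatus status(String jobId) {
            return store.get(jobId); // null if the job is unknown
        }

        @Override
        public boolean abort(String jobId) {
            JobStatus s = store.get(jobId);
            if (s == null || s.state != JobState.RUNNING) {
                return false; // only running jobs can be aborted
            }
            store.put(jobId, new JobStatus(s.id, s.type, JobState.ABORTED, s.progress));
            return true;
        }
    }

    public static void main(String[] args) {
        JobControl jobs = new InMemoryJobControl();
        String id = jobs.submit("fetch");
        jobs.reportProgress(id, 0.4f);
        for (JobStatus s : jobs.listJobs()) {
            System.out.println(s.id + " " + s.type + " " + s.state + " " + s.progress);
        }
        jobs.abort(id);
        System.out.println(jobs.status(id).state);
    }
}
```

Because all state lives behind the JobControl interface, a command-line tool, 
a REST layer or a web interface could each be a thin caller of the same 
methods, which is the layering the description aims for.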

More details will be provided later on. Feel free to comment if you have 
suggestions/questions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
