Refactoring/reimplementing crawling API (NutchApp)
--------------------------------------------------
Key: NUTCH-1286
URL: https://issues.apache.org/jira/browse/NUTCH-1286
Project: Nutch
Issue Type: Improvement
Components: administration gui, REST_api, web gui
Reporter: Ferdy Galema
This issue tracks the changes we (Mathijs and I) have planned for the API and
webapp in Nutchgora. We have a pretty good idea of how we want to use the
crawl API. It may involve some major refactoring, or perhaps a side
implementation next to the current NutchApp functionality, depending on how
much of the existing components we can reuse. The bottom line is that there
will be a strictly defined Java API that provides everything from
crawling/indexing to job control (listing jobs, tracking progress and aborting
jobs). There will be no server or service for tracking crawl state; all state
will be persisted one way or another and be queryable through the API. The
REST server shall be a very thin layer on top of the Java implementation. A
rich web interface will also be an easy layer to add once we have a cleanly
(but extensively) defined API. We will start, however, by making the API usable
from a simple command-line interface.
More details will be provided later on. Feel free to comment if you have
suggestions or questions.
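
To make the idea above more concrete, here is a minimal sketch of what a
stateless job-control API over persisted state could look like. Every name in
it (JobState, JobRecord, JobStore, InMemoryJobStore, CrawlApi and their
methods) is hypothetical and made up for illustration; none of it is taken
from the existing NutchApp code or from any planned implementation. The
in-memory store only stands in for whatever persistence ends up being used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical job states; not taken from Nutch. */
enum JobState { RUNNING, FINISHED, ABORTED }

/** Hypothetical immutable record describing one persisted job. */
final class JobRecord {
    final String id;
    final String type;      // e.g. "fetch" or "index" (illustrative values)
    final JobState state;
    final double progress;  // fraction between 0.0 and 1.0

    JobRecord(String id, String type, JobState state, double progress) {
        this.id = id;
        this.type = type;
        this.state = state;
        this.progress = progress;
    }
}

/** Hypothetical persistence layer; the API keeps no job state of its own. */
interface JobStore {
    void save(JobRecord record);
    JobRecord load(String id);   // returns null for an unknown id
    List<JobRecord> loadAll();
}

/** Stand-in store for the sketch; a real one would write to durable storage. */
final class InMemoryJobStore implements JobStore {
    private final Map<String, JobRecord> records = new LinkedHashMap<>();

    public void save(JobRecord record) { records.put(record.id, record); }
    public JobRecord load(String id) { return records.get(id); }
    public List<JobRecord> loadAll() { return new ArrayList<>(records.values()); }
}

/** Facade for job control: every call reads from or writes to the store. */
final class CrawlApi {
    private final JobStore store;

    CrawlApi(JobStore store) { this.store = store; }

    /** Registers a new running job and returns its id. */
    String submit(String type) {
        String id = "job-" + (store.loadAll().size() + 1);
        store.save(new JobRecord(id, type, JobState.RUNNING, 0.0));
        return id;
    }

    /** Records progress for a running job; reaching 1.0 marks it finished. */
    void reportProgress(String id, double fraction) {
        JobRecord job = require(id);
        if (job.state != JobState.RUNNING) {
            return; // finished or aborted jobs are left untouched
        }
        JobState next = fraction >= 1.0 ? JobState.FINISHED : JobState.RUNNING;
        store.save(new JobRecord(id, job.type, next, Math.min(fraction, 1.0)));
    }

    /** Lists all known jobs, in submission order. */
    List<JobRecord> listJobs() { return store.loadAll(); }

    /** Returns the last persisted progress of a job. */
    double progress(String id) { return require(id).progress; }

    /** Aborts a running job; returns false if it was not running. */
    boolean abort(String id) {
        JobRecord job = require(id);
        if (job.state != JobState.RUNNING) {
            return false;
        }
        store.save(new JobRecord(id, job.type, JobState.ABORTED, job.progress));
        return true;
    }

    private JobRecord require(String id) {
        JobRecord job = store.load(id);
        if (job == null) {
            throw new IllegalArgumentException("Unknown job: " + id);
        }
        return job;
    }
}
```

Because CrawlApi holds nothing but a reference to the store, two instances
created over the same store see the same jobs. That is the property that would
let a command-line tool, a REST layer and a web interface each be a thin
wrapper around the same calls.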