Hi there, how do I remove myself (for now) from the dev and user lists? I could not figure it out.

On Thu, Oct 28, 2010 at 5:22 AM, Andrzej Bialecki (JIRA) <[email protected]> wrote:

>
>      [ https://issues.apache.org/jira/browse/NUTCH-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Andrzej Bialecki updated NUTCH-880:
> ------------------------------------
>
>     Summary: REST API for Nutch (was: REST API (and webapp) for Nutch)
>
> The webapp part is tracked now in NUTCH-929.
>
> > REST API for Nutch
> > ------------------
> >
> >              Key: NUTCH-880
> >              URL: https://issues.apache.org/jira/browse/NUTCH-880
> >          Project: Nutch
> >       Issue Type: New Feature
> > Affects Versions: 2.0
> >         Reporter: Andrzej Bialecki
> >         Assignee: Andrzej Bialecki
> >      Attachments: API-2.patch, API.patch
> >
> >
> > This issue is for discussing a REST-style API for accessing Nutch.
> > Here's an initial idea:
> > * I propose to use org.restlet for handling requests and returning
> >   JSON/XML/whatever responses.
> > * hook up all regular tools so that they can be driven via this API. This
> >   would have to be an async API, since all Nutch operations take long time
> >   to execute. It follows then that we need to be able also to list running
> >   operations, retrieve their current status, and possibly
> >   abort/cancel/stop/suspend/resume/...? This also means that we would have
> >   to potentially create & manage many threads in a servlet - AFAIK this is
> >   frowned upon by J2EE purists...
> > * package this in a webapp (that includes all deps, essentially nutch.job
> >   content), with the restlet servlet as an entry point.
> > Open issues:
> > * how to implement the reading of crawl results via this API
> > * should we manage only crawls that use a single configuration per
> >   webapp, or should we have a notion of crawl contexts (sets of crawl
> >   configs) with CRUD ops on them? this would be nice, because it would
> >   allow managing of several different crawls, with different configs, in a
> >   single webapp - but it complicates the implementation a lot.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>

