[ 
https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujen Shah updated NUTCH-2132:
------------------------------
    Description: 
It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g. 
Fetcher events such as fetch-start and fetch-end, or a fetch report which may 
contain data like the outlinks of the currently fetched URL, its score, etc.). 

A consumer of this functionality could use this data to generate real-time 
visualizations and crawl statistics without having to wait for the fetch round 
to finish. 

The REST API could contain an endpoint which would respond with a URL to which 
a client could subscribe to get the fetcher events. 
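
As a rough illustration of the event flow described above, here is a minimal 
in-process sketch in Python. It is not Nutch code and not the attached patch: 
the names FetcherEvent, EventBus, CrawlStats and fetch(), the payload fields, 
and the example values are all hypothetical. A real implementation would 
publish over the network to the URL returned by the REST endpoint rather than 
call subscribers directly.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FetcherEvent:
    """Hypothetical event payload; field names are illustrative only."""
    event_type: str  # e.g. "fetch-start", "fetch-end", "fetch-report"
    url: str
    data: dict = field(default_factory=dict)  # e.g. outlinks, score


class EventBus:
    """Tiny in-process publisher/subscriber hub (assumed design)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[FetcherEvent], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, event_type: str,
                  callback: Callable[[FetcherEvent], None]) -> None:
        # Register a callback for one event type.
        self._subscribers[event_type].append(callback)

    def publish(self, event: FetcherEvent) -> None:
        # Deliver the event to every subscriber of its type, in order.
        for callback in list(self._subscribers.get(event.event_type, [])):
            callback(event)


def fetch(bus: EventBus, url: str) -> None:
    """Stand-in for a fetcher: emits events, performs no real fetch."""
    bus.publish(FetcherEvent("fetch-start", url))
    # Made-up report contents, only to show the shape of the payload.
    report = {"outlinks": [url + "/a", url + "/b"], "score": 1.0}
    bus.publish(FetcherEvent("fetch-report", url, report))
    bus.publish(FetcherEvent("fetch-end", url))


class CrawlStats:
    """Consumer that keeps running statistics while the crawl is ongoing."""

    def __init__(self) -> None:
        self.fetched = 0
        self.outlinks = 0

    def on_report(self, event: FetcherEvent) -> None:
        self.outlinks += len(event.data.get("outlinks", []))

    def on_end(self, event: FetcherEvent) -> None:
        self.fetched += 1
```

A consumer would subscribe CrawlStats.on_report to "fetch-report" and 
CrawlStats.on_end to "fetch-end", then read the counters at any point during 
the round instead of waiting for it to finish.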

  was:
It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g. 
Fetcher events). 
A consumer of this functionality could use this data to generate real-time 
visualizations and crawl statistics without having to wait for the fetch round 
to finish. 

The REST API could contain an endpoint which would respond with a URL to which 
a client could subscribe to get the fetcher events. 


> Publisher/Subscriber model for Nutch to emit events 
> ----------------------------------------------------
>
>                 Key: NUTCH-2132
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2132
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher, REST_api
>            Reporter: Sujen Shah
>              Labels: memex
>             Fix For: 1.11
>
>         Attachments: NUTCH-2132.patch
>
>
> It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g. 
> Fetcher events such as fetch-start and fetch-end, or a fetch report which may 
> contain data like the outlinks of the currently fetched URL, its score, etc.). 
> A consumer of this functionality could use this data to generate real-time 
> visualizations and crawl statistics without having to wait for the fetch 
> round to finish. 
> The REST API could contain an endpoint which would respond with a URL to 
> which a client could subscribe to get the fetcher events. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
