[ 
https://issues.apache.org/jira/browse/NUTCH-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943856#comment-14943856
 ] 

Julien Nioche commented on NUTCH-2132:
--------------------------------------

bq.  but that locks us into using Kibana, etc. Ideally one goal of this would 
be to enable it to work with multiple downstream front ends

I mentioned Logstash as an example; my point was more generally about 
leveraging the log files instead of modifying the code and possibly adding 
overhead and complexity. There are probably other tools doing similar things.

Having said that, Logstash is pluggable and supports various backends, so it 
would probably be possible to push things into a queue, for instance.

Talking about dependencies, this introduces a hard one on RabbitMQ. Isn't there 
a neutral API that could be programmed against? (JMS? AMQP?) - this would allow 
users to choose their favourite messaging queue.
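
For the sake of illustration only, a publisher coded against plain JMS could look 
roughly like the sketch below (class and method names are invented here, not taken 
from the attached patch); swapping brokers would then just be a matter of supplying 
a different ConnectionFactory through configuration.

{code:java}
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Destination;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;

/** Illustrative sketch: publishes Nutch events through the vendor-neutral JMS API. */
public class JmsEventPublisher {

  // Any broker with a JMS client library can supply this factory,
  // typically obtained from configuration or JNDI rather than hard-coded.
  private final ConnectionFactory factory;

  public JmsEventPublisher(ConnectionFactory factory) {
    this.factory = factory;
  }

  /** Sends one event, serialised as JSON text, to the given topic. */
  public void publish(String topicName, String jsonEvent) throws JMSException {
    Connection connection = factory.createConnection();
    try {
      Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
      Destination topic = session.createTopic(topicName);
      MessageProducer producer = session.createProducer(topic);
      producer.send(session.createTextMessage(jsonEvent));
    } finally {
      connection.close();
    }
  }
}
{code}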


 

> Publisher/Subscriber model for Nutch to emit events 
> ----------------------------------------------------
>
>                 Key: NUTCH-2132
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2132
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher, REST_api
>            Reporter: Sujen Shah
>              Labels: memex
>             Fix For: 1.11
>
>         Attachments: NUTCH-2132.patch
>
>
> It would be nice to have a Pub/Sub model in Nutch to emit certain events (e.g. 
> fetcher events like fetch-start, fetch-end, or a fetch report which may contain 
> data such as the outlinks of the currently fetched URL, its score, etc.). 
> A consumer of this functionality could use this data to generate real-time 
> visualizations and statistics of the crawl without having to wait for 
> the fetch round to finish. 
> The REST API could contain an endpoint which would respond with a URL to 
> which a client could subscribe to get the fetcher events. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
