[ 
https://issues.apache.org/jira/browse/NUTCH-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666616#comment-13666616
 ] 

Luca Cavanna commented on NUTCH-1527:
-------------------------------------

I just ran into this issue and thought it would be nice if Nutch supported 
Elasticsearch out of the box. I had a look at the code and saw a few things 
that I would do differently:
- You can use the BulkProcessor instead of creating and handling the 
BulkRequest manually. It executes the bulk automatically when needed and is 
flexible and configurable, so you would be able to remove a lot of 
boilerplate code.
- I know multicast discovery is convenient: as the code works now, you don't 
need to specify any URL, and the client node joins an existing cluster with 
the same name. Still, I would go for the other type of client here, the 
TransportClient, which is more lightweight and just sends requests to the 
configured URLs in a round-robin fashion, using the internal binary protocol 
that Elasticsearch uses for inter-node communication.
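
To make the two suggestions above concrete, here is a rough, self-contained 
sketch of the ideas only. It does not use the Elasticsearch API: the class 
names (BulkSketch, AutoFlushingBuffer, RoundRobinHosts) and the host strings 
are hypothetical stand-ins, and the real BulkProcessor and TransportClient 
offer more triggers and options than shown here. The sketch is also not 
thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Illustration only: hypothetical stand-ins, NOT the Elasticsearch API. */
public class BulkSketch {

    /** Buffers items and hands them to a callback once a size threshold is hit. */
    static final class AutoFlushingBuffer<T> {
        private final int maxActions;
        private final Consumer<List<T>> onFlush;
        private final List<T> pending = new ArrayList<>();

        AutoFlushingBuffer(int maxActions, Consumer<List<T>> onFlush) {
            if (maxActions < 1) {
                throw new IllegalArgumentException("maxActions must be >= 1");
            }
            this.maxActions = maxActions;
            this.onFlush = onFlush;
        }

        /** Adds one item; flushes automatically when the buffer is full. */
        void add(T item) {
            pending.add(item);
            if (pending.size() >= maxActions) {
                flush();
            }
        }

        /** Sends whatever is pending as one batch; does nothing if empty. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            List<T> batch = new ArrayList<>(pending);
            pending.clear();
            onFlush.accept(batch);
        }
    }

    /** Cycles through a fixed list of configured hosts in order. */
    static final class RoundRobinHosts {
        private final List<String> hosts;
        private int next = 0;

        RoundRobinHosts(List<String> hosts) {
            if (hosts.isEmpty()) {
                throw new IllegalArgumentException("at least one host is required");
            }
            this.hosts = new ArrayList<>(hosts);
        }

        /** Returns the next host, wrapping around at the end of the list. */
        String nextHost() {
            String host = hosts.get(next);
            next = (next + 1) % hosts.size();
            return host;
        }
    }

    public static void main(String[] args) {
        // Hypothetical addresses; a real setup would read these from the Nutch config.
        RoundRobinHosts hosts = new RoundRobinHosts(Arrays.asList("es1:9300", "es2:9300"));
        AutoFlushingBuffer<String> buffer = new AutoFlushingBuffer<>(
                2, batch -> System.out.println(hosts.nextHost() + " <- " + batch));

        buffer.add("doc1");
        buffer.add("doc2"); // threshold reached, first batch goes out
        buffer.add("doc3");
        buffer.flush();     // final flush for the leftover document
    }
}
```

The point of the sketch is that the indexer code only calls add() per 
document and flush() once at the end; the batching decision and the choice of 
target host live elsewhere, which is where the boilerplate would disappear.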

Let me know if I can help more. I'm certainly willing to get my hands dirty 
here if you want ;)
                
> Port nutch-elasticsearch-indexer to Nutch
> -----------------------------------------
>
>                 Key: NUTCH-1527
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1527
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.6, 2.1
>            Reporter: Lewis John McGibbney
>            Assignee: lufeng
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: NUTCH-1527.patch
>
>
> The source repos for this can be found here [0].
> This issue should be in line with the work already done by Julien and others 
> over at NUTCH-1047.
> [0] https://github.com/ctjmorgan/nutch-elasticsearch-indexer

