Re: Apache Nutch vs Multiple elasticsearch nodes

2018-11-28 Thread lewis john mcgibbney
Hi Marcello,
No, I don't think this behavior is correct.
First, however, I strongly suggest that we upgrade the Jest client in this
plugin. The most recent release is 6.3.1 and we are using 2.0.3.
Please see https://issues.apache.org/jira/browse/NUTCH-2677. If you are
able to provide a patch and test it, that would be great. Please see my
response inline below.

On Wed, Nov 28, 2018 at 6:42 AM  wrote:

> From: Marcello Lorenzi 
> To: user@nutch.apache.org
> Date: Wed, 28 Nov 2018 15:41:45 +0100
> Subject: Apache Nutch vs Multiple elasticsearch nodes
> Hi All,
> we installed the latest version of Apache Nutch to crawl some HTML pages,
> but so far we have tested all the operations with a single Elasticsearch
> instance. We use the Elasticsearch REST index writer. When we set its "host"
> parameter to "es-elk-pr01.test.local, es-elk-sv01.test.local", the JEST
> client starts with only 1 server:
>
>  INFO AbstractJestClient:56 - Setting server pool to a list of 1 servers: [
> http://es-elk-pr01.test.local , es-elk-sv01.test.local:9200]
>
> Is this behavior correct?
>

Please see the following message:
https://github.com/apache/nutch/blob/master/src/plugin/indexer-elastic-rest/src/java/org/apache/nutch/indexwriter/elasticrest/ElasticRestIndexWriter.java#L110
I think you should configure your host in index-writers.xml; specifically,
see

https://github.com/apache/nutch/blob/master/conf/index-writers.xml.template#L127-L150
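
The log line quoted above suggests that the whole "host" string was treated
as a single host name. To illustrate what a two-node pool would need, here is
a minimal sketch that splits a comma-separated host string into one URI per
node. The class HostListSketch and the method toServerUris are hypothetical
names made up for this example; they are not part of Nutch or Jest, and this
is not how the plugin currently behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class HostListSketch {

    // Hypothetical helper (not part of Nutch or Jest): turns a
    // comma-separated host string into one URI per Elasticsearch node.
    static List<String> toServerUris(String hostParam, int port, boolean https) {
        String scheme = https ? "https" : "http";
        List<String> uris = new ArrayList<>();
        for (String host : hostParam.split(",")) {
            String trimmed = host.trim();
            // Skip empty entries such as those left by a trailing comma.
            if (!trimmed.isEmpty()) {
                uris.add(scheme + "://" + trimmed + ":" + port);
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        // Host string and port taken from the original question.
        List<String> uris = toServerUris(
            "es-elk-pr01.test.local, es-elk-sv01.test.local", 9200, false);
        System.out.println("Server pool of " + uris.size() + " servers: " + uris);
    }
}
```

With the host string from the question, this yields two separate entries, one
per node, instead of a single entry containing both names.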

HTH
Lewis


>
> Thanks in advance,
> Marcello
>


-- 
http://home.apache.org/~lewismc/
http://people.apache.org/keys/committer/lewismc


Apache Nutch vs Multiple elasticsearch nodes

2018-11-28 Thread Marcello Lorenzi
Hi All,
we installed the latest version of Apache Nutch to crawl some HTML pages,
but so far we have tested all the operations with a single Elasticsearch
instance. We use the Elasticsearch REST index writer. When we set its "host"
parameter to "es-elk-pr01.test.local, es-elk-sv01.test.local", the JEST
client starts with only 1 server:

 INFO AbstractJestClient:56 - Setting server pool to a list of 1 servers: [
http://es-elk-pr01.test.local , es-elk-sv01.test.local:9200]

Is this behavior correct?

Thanks in advance,
Marcello