Hi Sebastian,

All of this is showing up, but the problem is that the content is not being sent. Nothing is indexed to ES. This is the output at debug level:
ElasticIndexWriter
  elastic.cluster : elastic prefix cluster
  elastic.host : hostname
  elastic.port : port (default 9200)
  elastic.index : elastic index command
  elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
  elastic.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
no modules loaded
loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
loaded plugin [org.elasticsearch.join.ParentJoinPlugin]
loaded plugin [org.elasticsearch.percolator.PercolatorPlugin]
loaded plugin [org.elasticsearch.script.mustache.MustachePlugin]
loaded plugin [org.elasticsearch.transport.Netty4Plugin]
created thread pool: name [force_merge], size [1], queue size [unbounded]
created thread pool: name [fetch_shard_started], core [1], max [8], keep alive [5m]
created thread pool: name [listener], size [2], queue size [unbounded]
created thread pool: name [index], size [4], queue size [200]
created thread pool: name [refresh], core [1], max [2], keep alive [5m]
created thread pool: name [generic], core [4], max [128], keep alive [30s]
created thread pool: name [warmer], core [1], max [2], keep alive [5m]
thread pool [search] will adjust queue by [50] when determining automatic queue size
created thread pool: name [search], size [7], queue size [1k]
created thread pool: name [flush], core [1], max [2], keep alive [5m]
created thread pool: name [fetch_shard_store], core [1], max [8], keep alive [5m]
created thread pool: name [management], core [1], max [5], keep alive [5m]
created thread pool: name [get], size [4], queue size [1k]
created thread pool: name [bulk], size [4], queue size [200]
created thread pool: name [snapshot], core [1], max [2], keep alive [5m]
node_sampler_interval[5s]
adding address [{#transport#-1}{nNtPR9OJShWSW-ayXRDILA}{localhost}{ 127.0.0.1:9300}]
connected to node [{tzfqJn0}{tzfqJn0sS5OPV4lKreU60w}{QCGd9doAQaGw4Q_lOqniLQ}{127.0.0.1}{ 127.0.0.1:9300}]
IndexingJob: done

On Wed, Feb 28, 2018 at 10:05 PM, Sebastian Nagel <
wastl.na...@googlemail.com> wrote:

> I never tried ES with Nutch 2.3 but it should be similar to setup as for
> 1.x:
>
> - enable the plugin "indexer-elastic" in plugin.includes
>   (upgrade and rename to "indexer-elastic2" in 2.4)
>
> - expects ES 1.4.1
>
> - available/required options are found in the log file (hadoop.log):
>   ElasticIndexWriter
>   elastic.cluster : elastic prefix cluster
>   elastic.host : hostname
>   elastic.port : port (default 9300)
>   elastic.index : elastic index command
>   elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
>   elastic.max.bulk.size : elastic bulk index length. (default 2500500 ~2.5MB)
>
> Sebastian
>
> On 02/28/2018 01:26 PM, Yash Thenuan Thenuan wrote:
> > Yeah
> > I was also thinking that
> > Can somebody help me with nutch 2.3?
> >
> > On 28 Feb 2018 17:53, "Yossi Tamari" <yossi.tam...@pipl.com> wrote:
> >
> >> Sorry, I just realized that you're using Nutch 2.x and I'm answering for
> >> Nutch 1.x. I'm afraid I can't help you.
> >>
> >>> -----Original Message-----
> >>> From: Yash Thenuan Thenuan [mailto:rit2014...@iiita.ac.in]
> >>> Sent: 28 February 2018 14:20
> >>> To: user@nutch.apache.org
> >>> Subject: RE: Regarding Indexing to elasticsearch
> >>>
> >>> IndexingJob (<batchId> | -all |-reindex) [-crawlId <id>] This is the output of
> >>> nutch index i have already configured the nutch-site.xml.
> >>>
> >>> On 28 Feb 2018 17:41, "Yossi Tamari" <yossi.tam...@pipl.com> wrote:
> >>>
> >>>> I suggest you run "nutch index", take a look at the returned help
> >>>> message, and continue from there.
> >>>> Broadly, first of all you need to configure your elasticsearch
> >>>> environment in nutch-site.xml, and then you need to run nutch index
> >>>> with the location of your CrawlDB and either the segment you want to
> >>>> index or the directory that contains all the segments you want to index.
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Yash Thenuan Thenuan [mailto:rit2014...@iiita.ac.in]
> >>>>> Sent: 28 February 2018 14:06
> >>>>> To: user@nutch.apache.org
> >>>>> Subject: RE: Regarding Indexing to elasticsearch
> >>>>>
> >>>>> All I want is to index my parsed data to elasticsearch.
> >>>>>
> >>>>>
> >>>>> On 28 Feb 2018 17:34, "Yossi Tamari" <yossi.tam...@pipl.com> wrote:
> >>>>>
> >>>>> Hi Yash,
> >>>>>
> >>>>> The nutch index command does not have a -all flag, so I'm not sure
> >>>>> what you're trying to achieve here.
> >>>>>
> >>>>> Yossi.
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Yash Thenuan Thenuan [mailto:rit2014...@iiita.ac.in]
> >>>>>> Sent: 28 February 2018 13:55
> >>>>>> To: user@nutch.apache.org
> >>>>>> Subject: Regarding Indexing to elasticsearch
> >>>>>>
> >>>>>> Can somebody please tell me what happens when we hit the bin/nutc
> >>>>>> index -all command.
> >>>>>> Because I can't figure out why the write function inside the
> >>>>>> elastic-indexer is not getting executed.