Thanks @Sebastian, but that didn't help either. I don't think that is the right way to push to different cores.
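For reference, the change suggested in the quoted reply would look roughly like the sketch below. It assumes the `host` field holds the full host name, as `indexchecker` reports for www.urgenthomework.com; the `www.` host names for the other two sites are assumed from the seed URLs and have not been checked with `indexchecker`. The exchange and writer ids are the ones from the configuration quoted below.

```xml
<!-- Sketch only: compare against the full host name, not the registered domain. -->
<exchange id="uahIndexernew" class="default">
  <writers>
    <writer id="indexer_solr_1" />
  </writers>
  <params>
    <param name="expr" value="doc.getFieldValue('host')=='www.urgenthomework.com'" />
  </params>
</exchange>

<!-- Assumed host name (taken from the seed URL, not verified). -->
<exchange id="mahIndexernew" class="default">
  <writers>
    <writer id="indexer_solr_2" />
  </writers>
  <params>
    <param name="expr" value="doc.getFieldValue('host')=='www.myassignmenthelp.net'" />
  </params>
</exchange>

<!-- Assumed host name (taken from the seed URL, not verified). -->
<exchange id="yahIndexernew" class="default">
  <writers>
    <writer id="indexer_solr_3" />
  </writers>
  <params>
    <param name="expr" value="doc.getFieldValue('host')=='www.assignmenthelp.net'" />
  </params>
</exchange>
```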

On Fri, Dec 27, 2019 at 5:10 PM Sebastian Nagel <wastl.na...@googlemail.com.invalid> wrote:

> Hi,
>
> the test compares names of the "host" and the registered domain:
>   doc.getFieldValue('host')=='urgenthomework.com'
>
> The host name is "www.urgenthomework.com". You can test it via:
>
> $> bin/nutch indexchecker https://www.urgenthomework.com/
> fetching: https://www.urgenthomework.com/
> ...
> host : www.urgenthomework.com
> ...
> title : Homework Help for College, University and School Students
> ...
>
> Best,
> Sebastian
>
>
> On 12/26/19 11:29 AM, Zara Parst wrote:
> > Hi, is it possible to crawl three different websites like
> >
> > 1. https://www.urgenthomework.com/
> > 2. https://www.myassignmenthelp.net/
> > 3. https://www.assignmenthelp.net/
> >
> > in a single Nutch configuration and then send the respective index pages to
> > the corresponding cores [uah, mah, yah] in Solr? I tried to achieve it with
> > exchanges and writer ids. Please look below for my configurations.
> >
> > -------------exchange.xml---------------------------------
> >
> > <exchange id="uahIndexernew" class="default">
> >   <writers>
> >     <writer id="indexer_solr_1" />
> >   </writers>
> >   <params>
> >     <param name="expr" value="doc.getFieldValue('host')=='urgenthomework.com'" />
> >   </params>
> > </exchange>
> >
> > <exchange id="mahIndexernew" class="default">
> >   <writers>
> >     <writer id="indexer_solr_2" />
> >   </writers>
> >   <params>
> >     <param name="expr" value="doc.getFieldValue('host')=='myassignmenthelp.net'" />
> >   </params>
> > </exchange>
> >
> > <exchange id="yahIndexernew" class="default">
> >   <writers>
> >     <writer id="indexer_solr_3" />
> >   </writers>
> >   <params>
> >     <param name="expr" value="doc.getFieldValue('host')=='assignmenthelp.net'" />
> >   </params>
> > </exchange>
> >
> > ---------------------------------index.writers.xml----------------------------------------
> >
> > <writer id="indexer_solr_1"
> >         class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
> >   <parameters>
> >     <param name="type" value="http" />
> >     <param name="url" value="http://localhost:8983/solr/uah" />
> >     <param name="collection" value="" />
> >     <param name="weight.field" value="" />
> >     <param name="commitSize" value="1000" />
> >     <param name="auth" value="false" />
> >     <param name="username" value="username" />
> >     <param name="password" value="password" />
> >   </parameters>
> >   <mapping>
> >     <copy>
> >       <!-- <field source="title" dest="content" />
> >       <field source="metatag.description" dest="content" />
> >       <field source="metatag.keywords" dest="content" /> -->
> >     </copy>
> >     <rename></rename>
> >     <remove>
> >       <field source="segment" />
> >       <field source="host" />
> >       <field source="url" />
> >       <!-- <field source="metatag.description" />
> >       <field source="metatag.keywords" />
> >       <field source="date" />
> >       <field source="url" />
> >       -->
> >     </remove>
> >   </mapping>
> > </writer>
> >
> > <writer id="indexer_solr_2"
> >         class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
> >   <parameters>
> >     <param name="type" value="http" />
> >     <param name="url" value="http://localhost:8983/solr/mah" />
> >     <param name="collection" value="" />
> >     <param name="weight.field" value="" />
> >     <param name="commitSize" value="1000" />
> >     <param name="auth" value="false" />
> >     <param name="username" value="username" />
> >     <param name="password" value="password" />
> >   </parameters>
> >   <mapping>
> >     <copy>
> >     </copy>
> >     <rename></rename>
> >     <remove>
> >       <field source="segment" />
> >       <field source="host" />
> >       <field source="url" />
> >     </remove>
> >   </mapping>
> > </writer>
> >
> > <writer id="indexer_solr_3"
> >         class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
> >   <parameters>
> >     <param name="type" value="http" />
> >     <param name="url" value="http://localhost:8983/solr/yah" />
> >     <param name="collection" value="" />
> >     <param name="weight.field" value="" />
> >     <param name="commitSize" value="1000" />
> >     <param name="auth" value="false" />
> >     <param name="username" value="username" />
> >     <param name="password" value="password" />
> >   </parameters>
> >   <mapping>
> >     <copy>
> >     </copy>
> >     <rename></rename>
> >     <remove>
> >       <field source="segment" />
> >       <field source="host" />
> >       <field source="url" />
> >     </remove>
> >   </mapping>
> > </writer>
> >
> > ---------------------------------------------------------------------------------------------------------------
> >
> > But it is not pushing data into the corresponding cores; rather, it is sending
> > data from the different domains into one core. Please do let me know. I am sure
> > there has to be a way to achieve it. I didn't try with subcollection.xml. Do you
> > think I can achieve it using subcollection?
> >