Thanks Lewis,

It was a very basic mistake on my part. The default crawl script launches generateJob with the -noFilter switch, which I failed to notice, so the rules in regex-urlfilter.txt were never applied at that step. The rest of the configuration and the job file were fine. Your reply was indeed helpful in bootstrapping the debugging.
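
For anyone who hits the same symptom, here is a rough Python sketch of why the filter file appeared to be ignored. It is not Nutch code: `RULES`, `regex_filter`, `generate`, and the `example.org` patterns are all made up for illustration. It only assumes the usual regex-urlfilter.txt convention that rules are tried in order, a `+` rule accepts, a `-` rule rejects, and a URL matching no rule is dropped.

```python
import re

# Hypothetical rules in regex-urlfilter.txt style: '+' accepts, '-' rejects,
# and the first rule whose pattern matches decides. Patterns are examples only.
RULES = [
    ("-", re.compile(r"\.(gif|jpg|png|css|js)$")),
    ("+", re.compile(r"^https?://([a-z0-9-]+\.)*example\.org/")),
]


def regex_filter(url):
    """Return True if the URL passes the rules, False otherwise."""
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False  # no rule matched, so the URL is dropped


def generate(urls, no_filter=False):
    """Toy stand-in for the generate step (an assumption, not the real job)."""
    if no_filter:
        # With filtering switched off, the rules above are never consulted.
        return list(urls)
    return [u for u in urls if regex_filter(u)]


URLS = [
    "http://example.org/page.html",
    "http://example.org/logo.png",
    "http://other.example.com/",
]

filtered = generate(URLS)
unfiltered = generate(URLS, no_filter=True)
```

In this sketch `filtered` keeps only the first URL, while `unfiltered` keeps all three, which mirrors what I was seeing: the rules themselves were fine, they were just bypassed.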

-Gajanan

On Thu, Oct 11, 2018 at 9:39 PM lewis john mcgibbney <lewi...@apache.org> wrote:
> Hi Gajanan,
> Seeing as you are using 2.x, are you making sure that the project has been
> built with the correct regex-urlfilter.txt being present on ClassPath and
> included in the job jar you are using?
>
> On Thu, Oct 11, 2018 at 12:19 AM <user-digest-h...@nutch.apache.org>
> wrote:
>
> > From: Gajanan Watkar <gajananwat...@gmail.com>
> > To: user@nutch.apache.org
> > Cc:
> > Bcc:
> > Date: Wed, 10 Oct 2018 17:19:24 +0530
> > Subject: Re: Unable to get regex-urlfilter working
> >
> > I am using Nutch 2.x with HBase as backend storage.
> >
> > *-Gajanan*