Answer to my own question here:

http://lucene.472066.n3.nabble.com/How-to-install-or-use-nutch-patch-td2021141.html

-----Original Message-----
From: Ralf R. Kotowski [mailto:[email protected]] 
Sent: Saturday, November 02, 2013 7:00 PM
To: [email protected]
Subject: RE: How to Crawl Specific sites

Thank you very much,

Excuse my ignorance, but I'm not familiar with Jira or with how to apply
patches. If someone could enlighten me, that would be great.

thnx

-----Original Message-----
From: Talat UYARER [mailto:[email protected]] 
Sent: Saturday, November 02, 2013 6:47 PM
To: [email protected]; Ralf R. Kotowski
Subject: RE: How to Crawl Specific sites

Hi Ralf,
You can find it as NUTCH-1661 in Jira. I uploaded it today :)

Talat

Sent with AquaMail for Android
http://www.aqua-mail.com


On 2 November 2013 19:10:04 "Ralf R. Kotowski" <[email protected]> wrote:
> Would you be willing to share this code?
>
> Thnx
>
> -----Original Message-----
> From: Talat UYARER [mailto:[email protected]]
> Sent: Tuesday, October 15, 2013 5:15 PM
> To: [email protected]
> Subject: Re: How to Crawl Specific sites
>
> Hi,
> In addition to Markus's answer: if you don't want to fetch non-Indian
> websites again, you can do it by writing some custom code. We actually
> wrote such code because we had the same need. Normally, if your websites
> are mixed, like .com or .in, you can't tell a website's language from
> the URL. We solved this by writing a custom FetchSchedule. We check each
> website's language in its shouldFetch method. If the language is not
> allowed, we don't generate the URL again. If you are willing to wait, I
> will share our code.
>
> Talat
>
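For anyone reading this in the archive, the idea Talat describes can be sketched roughly as below. This is not the code from NUTCH-1661 and it does not use Nutch's real FetchSchedule interface; the class LanguageGatedSchedule, the PageRecord type, its fields, and the language codes are all hypothetical names picked for illustration. It only shows the decision itself: once a page's language is known and is not on the allowed list, it is never scheduled again.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: NOT the NUTCH-1661 patch and NOT Nutch's FetchSchedule API.
class LanguageGatedSchedule {

    /** Stand-in for the per-URL record a crawler keeps (assumed shape). */
    static final class PageRecord {
        final String url;
        final String detectedLanguage; // null until the page has been fetched once
        final long nextFetchTime;      // epoch milliseconds

        PageRecord(String url, String detectedLanguage, long nextFetchTime) {
            this.url = url;
            this.detectedLanguage = detectedLanguage;
            this.nextFetchTime = nextFetchTime;
        }
    }

    final Set<String> allowedLanguages;

    LanguageGatedSchedule(Set<String> allowedLanguages) {
        this.allowedLanguages = allowedLanguages;
    }

    /** Returns true if the page may be put on the next fetch list. */
    boolean shouldFetch(PageRecord page, long now) {
        // A page whose language is known and not allowed is never refetched.
        if (page.detectedLanguage != null
                && !allowedLanguages.contains(page.detectedLanguage)) {
            return false;
        }
        // Unknown or allowed language: fall back to the usual time check.
        return page.nextFetchTime <= now;
    }

    public static void main(String[] args) {
        // Illustrative ISO 639-1 codes: Hindi, Tamil, Telugu, English.
        Set<String> allowed = new HashSet<>(Arrays.asList("hi", "ta", "te", "en"));
        LanguageGatedSchedule schedule = new LanguageGatedSchedule(allowed);
        long now = System.currentTimeMillis();

        PageRecord unseen = new PageRecord("http://example.com/a", null, 0L);
        PageRecord hindi = new PageRecord("http://example.com/b", "hi", 0L);
        PageRecord german = new PageRecord("http://example.com/c", "de", 0L);

        System.out.println(schedule.shouldFetch(unseen, now)); // true
        System.out.println(schedule.shouldFetch(hindi, now));  // true
        System.out.println(schedule.shouldFetch(german, now)); // false
    }
}
```

Note that a page has to be fetched at least once before its language is known, so this only prevents refetching. It does not avoid the first fetch.
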
> 15-10-2013 13:36 tarihinde, Markus Jelsma yazdı:
> > Hi - either by using a language detector that only allows some or all
> > common languages spoken in India, or by using a domain URL filter to
> > restrict to the .in domain.
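The second option Markus mentions, restricting to the .in domain, comes down to a check on the host name. A minimal standalone sketch follows, assuming the common filter convention of returning the URL when it is accepted and null when it is rejected. InDomainFilter is a made-up class name, and this is plain Java, not a Nutch plugin.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical sketch of a ".in only" URL filter; not a Nutch plugin.
class InDomainFilter {

    /** Returns the URL unchanged if its host ends in .in, otherwise null. */
    static String filter(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return null; // no usable host, reject
            }
            // Compare on the host only, so ".in" elsewhere in the URL is ignored.
            return host.toLowerCase(Locale.ROOT).endsWith(".in") ? url : null;
        } catch (URISyntaxException e) {
            return null; // malformed URL, reject
        }
    }

    public static void main(String[] args) {
        System.out.println(filter("http://www.example.in/page"));
        System.out.println(filter("http://www.example.com/page"));
        System.out.println(filter("http://india.example.com/login.in"));
    }
}
```

Only the first URL is kept. As noted elsewhere in this thread, a check like this misses Indian sites hosted under .com, which is where the language-based approach comes in.
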
> >  -----Original message-----
> >> From: Jayadeep Reddy <[email protected]>
> >> Sent: Tuesday 15th October 2013 12:10
> >> To: [email protected]
> >> Subject: How to Crawl Specific sites
> >>
> >> How can I index data of only Indian websites?
> >>
> >> -- Jayadeep Reddy.S,
> >> M.D & C.E.O
> >> e Health Access Pvt.Ltd
> >> www.ehealthaccess.com
> >> Hyderabad-Chennai-Banglore
> >> http://www.youtube.com/watch?v=0k5LX8mw6Sk
> >>
>
>



