Dear all,

I am trying to use Nutch as part of a focused crawler.
To build it, I need to find which classes in the source code handle the
following:
(1) the actual HTML parsing, where links (A.HREF) are extracted, so that I can
add some conditional statements there (for example, to check the text
surrounding a link)
(2) the addition of URLs to the queue

I would appreciate any help with this.


Best,
Anastasia

