Hi All,

I am working with Nutch 1.1, which I have modified to suit my
purpose. I have implemented custom query filters to handle some
specific functionality, so I am not using any of the query filters
that come bundled with the Nutch plugins.
Now I am faced with a major problem. NutchBean has a
search(...) method that removes duplicate hits from the same site and
restricts them to 2 per site by default. Whenever I fire a
query from the command line, I get a Query Exception saying "unknown
field name null". This is because a null field is being added when the
query is re-fired.
When I explicitly add the "site" field, it gets stuck in an infinite loop and
the query is fired continuously.
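
To make clear which behaviour I mean, here is a rough, self-contained sketch of the per-site limit as I understand it. It is only an illustration, not the actual NutchBean code: the class SiteDedupSketch, the Hit type and the limitPerSite method are names I made up, and the re-firing of the query is not modelled at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names exist in Nutch.
class SiteDedupSketch {

    /** Hypothetical stand-in for a search hit; holds only what the sketch needs. */
    static final class Hit {
        final String url;
        final String site;

        Hit(String url, String site) {
            this.url = url;
            this.site = site;
        }
    }

    /** Keeps at most maxHitsPerSite hits for each site, preserving rank order. */
    static List<Hit> limitPerSite(List<Hit> ranked, int maxHitsPerSite) {
        Map<String, Integer> seenPerSite = new HashMap<>();
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : ranked) {
            // Count this hit against its site, then keep it only while under the limit.
            int count = seenPerSite.merge(hit.site, 1, Integer::sum);
            if (count <= maxHitsPerSite) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
                new Hit("http://a.example/1", "a.example"),
                new Hit("http://a.example/2", "a.example"),
                new Hit("http://b.example/1", "b.example"),
                new Hit("http://a.example/3", "a.example"));

        // With a limit of 2, the third a.example hit is dropped.
        for (Hit hit : limitPerSite(ranked, 2)) {
            System.out.println(hit.url);
        }
    }
}
```

In this sketch the limit is applied in a single pass over hits that have already been fetched. The problem I describe appears only once the query is fired again, which the sketch leaves out.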

When I disable this duplicate-removal check by commenting out those lines,
everything works fine; however, multiple hits from the same site are
then shown.

Can anyone throw some light on this particular method: what does it actually
do, and how can I solve this problem?


Thanks & Regards,

Parnab Chanda
Research Scholar
IIT Kharagpur
