Hi,
There is an open thread on the user list for this right now. Please look in
the recent archive.
I think it would be best to take this conversation over there.
Lewis

On Thursday, June 20, 2013, Jamshaid Ashraf <[email protected]> wrote:
> Hi All,
>
> I'm using Nutch 2.x/Cassandra and I have 3 urls in seed list:
>
> www.news1.com
> www.news2.com
> www.news3.com
>
> Now I want to parse and extract the HTML of those links. To fulfill this
> requirement I have written a parse filter plugin.
>
> The problem is that the line "page.getContent().array()" returns the
> HTML of the above 3 sites each time the Nutch parse filter's @Override
> filter method is called for the sites in the seed list.
>
> Thanks in advance!
> Jamshaid
>

-- 
*Lewis*
