As the title says, I want to crawl a page that contains many URLs, but only the ones inside a specific div are meaningful to me. I want to write a plugin to filter them, but I don't know which extension point I should choose.
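To make the goal concrete, here is a minimal sketch of the filtering I am after. It is plain Python using only the standard library, not crawler plugin code, and the names `DivLinkExtractor`, `links_in_div` and the div id `content` are made up for illustration. It assumes the div is identified by its `id` attribute.

```python
from html.parser import HTMLParser


class DivLinkExtractor(HTMLParser):
    """Collect href values of <a> tags nested inside <div id=target_id>."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id  # hypothetical: id of the div we care about
        self.depth = 0              # div nesting depth; 0 means outside the target div
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                # A nested div inside the target div.
                self.depth += 1
            elif attrs.get("id") == self.target_id:
                # Entering the target div itself.
                self.depth = 1
        elif tag == "a" and self.depth > 0 and attrs.get("href"):
            # Keep only links that appear while we are inside the target div.
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def links_in_div(html, target_id):
    """Return the hrefs found inside the div with the given id."""
    parser = DivLinkExtractor(target_id)
    parser.feed(html)
    parser.close()
    return parser.links
```

For a page with navigation links, a `content` div and a footer, only the links inside the `content` div should end up in the fetch list. What I don't know is where in the plugin chain this kind of logic belongs.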
The htmlparser filter can get the HTML content, but it seems to run after the "add to fetch list" operation. The urlfilter can control the fetch list, but I can't get the HTML content in it. I look forward to any helpful replies, thanks.

