Hi Byzen,

I understand you have commented out the image suffixes in the regex filter. Can you share your nutch-site.xml and your regex filter, and also let us know if you have changed anything in nutch-default.xml?
--
Thanks
Madhav Sharan

On Mon, Nov 30, 2015 at 10:15 PM, Baizhang Ma <[email protected]> wrote:

> Hi, everyone.
> I'm a new Nutch user and I want to crawl images from webpages. I have
> excluded image suffixes like gif|GIF|jpg|JPG|png|PNG|jpeg|JPEG|bmp|BMP
> in regex-urlfilter.txt, but it does not work. My Nutch version is
> 2.2.1. Could anyone kindly tell me how to do it? If I need to use a
> plugin, could you tell me which plugin I need and how to configure it, as I
> am quite inexperienced with this. Thank you very much.
>
> Best regards,
> Byzen. Ma
>

