You should modify the "regex-urlfilter" file.
Try searching the forum for "regex"; there are plenty of topics about
this.
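
As a rough sketch of what such rules could look like (assuming the file
is conf/regex-urlfilter.txt and that the first matching rule decides
whether a URL is accepted; the patterns are illustrative, not taken from
a real setup):

```
# accept URLs ending in .php or .htm, with an optional query string
+\.(php|htm)(\?.*)?$

# reject everything else
-.
```

Each rule has to start in the first column with "+" (accept) or "-"
(reject). Note that the filter is applied while crawling as well, so
with rules this strict, pages without these extensions (a start URL
such as http://www.example.com/, for instance) would probably be
skipped and their links never followed. Depending on your Nutch
version, the one-step "crawl" command may read crawl-urlfilter.txt
instead, so check which file your setup actually uses.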


Tobias Zahn wrote:
> 
> Good evening everybody!
> I have searched Google, the FAQs and so on, but I didn't find anything
> on how to get only some types of files indexed (e.g. every file ending
> in .php and .htm). Is there a way to do this?
> 
> It would also be helpful for me if it were possible to get a list of
> all indexed URLs of these file types.
> 
> TIA,
> Tobias Zahn
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-only-some-filetypes-with-Nutch-tf3049622.html#a8478903
Sent from the Nutch - User mailing list archive at Nabble.com.

