Thank you for your help! I have allowed only the files I want to index in conf/crawl-urlfilter.txt. This seems to be the wrong file (because the crawler fetches nothing). Is there a page listing which conf file is for what?
Tobias Zahn

Sami Siren wrote:
> Tobias Zahn wrote:
>> Hello again,
>> I think I'm going to have a problem here: what if I'd like to index only
>> files like .gif? I think I won't get anything in my index that way :-(
>> Is there a way to get all URLs to such files anyway (maybe on a txt-list)?
>
> You would have to allow html to be fetched to find the images. You would
> also need to change the indexer to index just the content you are interested
> in (images) and skip the rest.
>
> --
> Sami Siren

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
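The point in the quoted reply (HTML pages must be fetched so that links to the images can be discovered) can be illustrated with a minimal sketch. It assumes the filter file uses "+regex" / "-regex" lines where the first matching rule decides and an unmatched URL is ignored. This is not Nutch's own code: the function names, the rule sets, and the example.com URLs are all hypothetical.

```python
import re

def parse_rules(text):
    """Parse '+regex' / '-regex' lines into (allow, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """First matching rule decides; a URL matching no rule is rejected (assumed)."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False

# Hypothetical rule set that allows only .gif files.
GIF_ONLY = parse_rules(r"""
+\.gif$
-.
""")

# Hypothetical rule set that also lets HTML pages through, so links can be followed.
HTML_AND_GIF = parse_rules(r"""
+\.gif$
+\.html?$
-.
""")

seed = "http://example.com/index.html"  # hypothetical seed page
image = "http://example.com/logo.gif"   # hypothetical image linked from the seed

for name, rules in (("gif only", GIF_ONLY), ("html + gif", HTML_AND_GIF)):
    print(name, "| seed accepted:", accepts(rules, seed),
          "| image accepted:", accepts(rules, image))
```

With the gif-only rules the seed page itself is rejected, so a crawler never sees the links that lead to the images, which would match the "fetches nothing" symptom. Letting HTML through solves discovery only; as the reply notes, the indexer would still have to skip everything except the images.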
