Thank you for your help!
I have allowed only the files I want to index in
conf/crawl-urlfilter.txt
That seems to be the wrong file, because the crawler now fetches nothing.
Is there a page that lists what each conf file is for?


Tobias Zahn


Sami Siren wrote:
> Tobias Zahn wrote:
>> Hello again,
>> I think I'm going to have a problem here: what if I'd like to index only
>> files like .gif? I think I won't get anything in my index that way :-(
>> Is there a way to get all URLs of such files anyway (maybe as a text list)?
> 
> You would have to allow html to be fetched to find the images. You would
> also need to change the indexer to index just the content you are
> interested in (images) and skip the rest.
> 
> --
>  Sami Siren
> 
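
The two steps Sami describes (fetch html so links to images can be found,
then index only the images) can be sketched as a two-stage filter. This is
only an illustration in Python, not Nutch code: the rule lists, the function
names accepts and select, and the example.com URLs are all made up for the
example. It assumes rules in the "+regex / -regex" style where the first
matching rule decides and a URL that matches no rule is rejected.

```python
import re

# Hypothetical fetch rules: allow pages and .gif files, reject everything else.
FETCH_RULES = [
    ("-", r"^(file|ftp|mailto):"),   # skip non-http schemes
    ("+", r"\.(html?|gif)$"),        # html is needed to discover image links
    ("+", r"/$"),                    # directory-style URLs usually serve html
    ("-", r"."),                     # reject anything not matched above
]

# Hypothetical index rules: keep only the images.
INDEX_RULES = [
    ("+", r"\.gif$"),
    ("-", r"."),
]


def accepts(rules, url):
    """Return True if the first rule whose pattern matches url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url, re.IGNORECASE):
            return sign == "+"
    return False  # assumption: no matching rule means the URL is rejected


def select(rules, urls):
    """Return the URLs accepted by the given rule list, in input order."""
    return [u for u in urls if accepts(rules, u)]


urls = [
    "http://example.com/",
    "http://example.com/page.html",
    "http://example.com/logo.gif",
    "http://example.com/doc.pdf",
    "mailto:someone@example.com",
]

fetched = select(FETCH_RULES, urls)     # stage 1: what the crawler may fetch
indexed = select(INDEX_RULES, fetched)  # stage 2: what ends up in the index
print(fetched)
print(indexed)
```

If the fetch rules allowed only .gif, the html pages would be rejected in
stage 1 and no image links would ever be discovered, which would match the
"fetches nothing" symptom.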


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
