Hi,
You can put a condition in the Generator phase that checks the file size and
decides whether the file should be indexed or not.
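
A minimal sketch of what such a size condition could look like. Note that
SizeLimit is a hypothetical helper and not part of the Nutch API. Where it
gets called is also an assumption: the size is usually only known once the
server has reported it (for example through a Content-Length header), so the
check may have to live in a custom plugin rather than in the Generator itself.

```java
// Hypothetical helper for illustration only; not a Nutch class.
final class SizeLimit {
    private final long maxBytes;         // largest size (in bytes) we accept
    private final boolean acceptUnknown; // policy when no size is reported

    SizeLimit(long maxBytes, boolean acceptUnknown) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        this.maxBytes = maxBytes;
        this.acceptUnknown = acceptUnknown;
    }

    /**
     * Decides whether a document of the given size should be processed.
     * A negative contentLength means the size is unknown.
     */
    boolean accepts(long contentLength) {
        if (contentLength < 0) {
            return acceptUnknown;
        }
        return contentLength <= maxBytes;
    }

    public static void main(String[] args) {
        // Assumed limit of 1 MB; documents of unknown size are rejected.
        SizeLimit limit = new SizeLimit(1024L * 1024L, false);
        long[] sizes = {2048L, 5L * 1024L * 1024L, -1L};
        for (long size : sizes) {
            System.out.println(size + " bytes accepted: " + limit.accepts(size));
        }
    }
}
```

Whether documents of unknown size are accepted is left as an explicit flag,
since servers do not always report a length.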
Tjabring van Egten wrote:
Hi,
Is there a way to prevent Nutch from fetching files that are over a certain size?
I see that I can truncate files that are above a certain number of bytes
in the normal configuration, but I would like to prevent them from being fetched
and indexed at all.
I'm using Nutch 0.8.1. Any help is welcome!
Kind regards,
T. vanEgten