I've been using Nutch to index production log files from a client application. It's been a great tool because we get a large volume of logs from the field and often have to run complicated pattern searches. Lately we've been having some issues managing our disk space. I noticed that Nutch keeps all of the content in the segments' content folder. Is there a reason all of the content is stored? I didn't see any obvious setting for indexing only, without keeping the content.
I do use the "more" search plugins to filter by date and URL. Maybe these require the content in the content folder? Any help would be much appreciated.

Roberto
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
