I would like a segment filter that removes unneeded content. I only want to keep the content of pages that are still indexed in Solr and that belong to this segment, i.e. the pages returned when I query Solr by this segment's name. Is there an existing tool for this?
SegmentMerger is a no-go for me. It needs too many resources, and it changes the segment name, so I would then have to reindex the documents. I don't want to do that, because old content would then overwrite newer content in the index (Solr). I only want to reduce the disk footprint of the segments and of Nutch. Regards