Hello all, I need to remove from Nutch some URLs which are marked with the status "db_gone".
I have already removed these URLs from the crawldb as follows:

- I specified a filter in regex-urlfilter.txt to exclude these URLs.
- bin/nutch mergedb crawl/crawldb2 crawl/crawldb -filter
- mv crawl/crawldb2 crawl/crawldb

What I want to know is whether I should remove these URLs from anywhere else (e.g. should I do anything with the linkdb or the segments?).

Thanks in advance,
Marseld Dedgjonaj
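
For reference, the crawldb steps described here can be sketched as a small shell function. This is only a sketch under some assumptions: the function name `filter_crawldb`, the `NUTCH` variable, the `crawldb.bak` backup directory and the example filter pattern are hypothetical and not part of the commands actually run. The backup step is there because a plain `mv crawl/crawldb2 crawl/crawldb` moves the new db inside the old one if the old directory still exists.

```shell
# Sketch: rebuild the crawldb through the URL filters and swap it in.
# Assumes conf/regex-urlfilter.txt already contains the exclusion rule,
# e.g. a hypothetical line such as:  -^http://www\.example\.com/removed/
# placed above the final catch-all "+." rule.
filter_crawldb() {
  nutch="${NUTCH:-bin/nutch}"   # path to the nutch launcher (assumption)
  crawl_dir="${1:-crawl}"       # directory that holds crawldb

  # Write a filtered copy of the crawldb; -filter applies the URL filters.
  "$nutch" mergedb "$crawl_dir/crawldb2" "$crawl_dir/crawldb" -filter || return 1

  # Move the old db aside first, so the new one is not moved *into* it.
  mv "$crawl_dir/crawldb" "$crawl_dir/crawldb.bak" || return 1
  mv "$crawl_dir/crawldb2" "$crawl_dir/crawldb"
}
```

Called as `filter_crawldb crawl` from the Nutch directory, this leaves the filtered db in crawl/crawldb and the previous one in crawl/crawldb.bak, which can be deleted once the result has been checked.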

