Yes, until those URLs are re-fetched, user searches will still return the old URIs. One solution is to delete the broken records with a Solr delete query on the document id, if you know all of the broken URIs. Another method would be to change the fetch time of the broken URIs so that they are re-fetched sooner, but this is not currently supported.
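As a rough illustration of the first option, here is a minimal Python sketch that builds a Solr XML delete message for a list of known broken URIs and posts it to the update handler. It assumes that the Solr document id is the page URL and that the update handler is reachable at a URL like http://localhost:8983/solr/update; both are assumptions, so check them against your own schema and setup. The function names are made up for this example.

```python
from urllib import request
from xml.sax.saxutils import escape


def build_delete_xml(doc_ids):
    """Build a Solr XML delete message with one <id> element per document id."""
    # escape() protects characters such as '&' that are common in URLs
    ids = "".join("<id>{}</id>".format(escape(doc_id)) for doc_id in doc_ids)
    return "<delete>{}</delete>".format(ids)


def delete_from_solr(solr_update_url, doc_ids):
    """POST the delete message to Solr and commit. Returns the HTTP status code.

    solr_update_url is an assumption, e.g. "http://localhost:8983/solr/update".
    """
    body = build_delete_xml(doc_ids).encode("utf-8")
    req = request.Request(
        solr_update_url + "?commit=true",
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical old-CMS URIs; replace with the real list of broken URIs.
    broken = [
        "http://www.example.com/old/page.php?id=1&lang=en",
        "http://www.example.com/old/page.php?id=2&lang=en",
    ]
    # Only print the message here; call delete_from_solr() once it looks right.
    print(build_delete_xml(broken))
```

After the delete is committed, the old URIs should no longer show up in search results, and you can recrawl and reindex as usual.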

On Thu, Oct 3, 2013 at 8:18 AM, Bayu Widyasanyata <[email protected]>wrote:

> Hi Feng,
>
> How about the existing 'records' stored on current Solr database?
> Before that URL re-fetch again, then search result will refer to old URI
> format (from old CMS).
> User will direct to broken link URL.
>
> If I could delete those old records, it will force clean up old database.
> Then I should recrawl and reindex as usual.
>
> Thanks,
>
>
> On Wed, Oct 2, 2013 at 9:14 AM, feng lu <[email protected]> wrote:
>
> > Hi Bayu
> >
> > Nutch will set the status of that url to STATUS_DB_GONE if the url can not
> > fetch successful, and you run the bin/nutch solrclean command that nutch
> > will remove the GONE documents from solr.
> >
> >
> > On Wed, Oct 2, 2013 at 7:24 AM, Bayu Widyasanyata
> > <[email protected]>wrote:
> >
> > > Hi,
> > >
> > > One of my seed URL was changed to new CMS which affect to its URI
> > > presentation format.
> > >
> > > How could I delete the old format of CMS on Solr database, then I could
> > > recrawl and reindex again with new URI format comes from new CMS?
> > >
> > > Thanks!
> > >
> > > --
> > > wassalam,
> > > [bayu]
> > >
> >
> >
> >
> > --
> > Don't Grow Old, Grow Up... :-)
> >
>
>
>
> --
> wassalam,
> [bayu]
>

--
Don't Grow Old, Grow Up... :-)

