How to check URLs that have been indexed by Solr?

Bayu Widyasanyata Mon, 17 Feb 2014 15:03:06 -0800

Hi,

Sometimes we accidentally crawl URLs in unneeded formats and push them all
the way through to the last "solrindex" step.


As we know, we can drop those URLs by adding a regex to
regex-urlfilter.txt and running "nutch updatedb". Those URLs will then be
dropped from the crawldb.

But how can we make sure that URLs which were already indexed by Solr
("nutch solrindex") before we ran "nutch updatedb" have also been deleted?
Are those URLs also deleted from the index when we run "solrindex" again?
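
To make the check concrete, here is a minimal sketch of how the Solr side
could be inspected directly. It is not a confirmed Nutch procedure. It
assumes that the Solr core is reachable at the hypothetical address in
SOLR_BASE, that the "id" field holds the page URL as an untokenized string
(as I understand the stock Nutch schema to do), and that the unwanted URLs
share a common prefix. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical Solr core address; adjust to the real installation.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def prefix_query(url_prefix, field="id"):
    """Return a Solr prefix query for documents whose field starts with url_prefix.

    The {!prefix} query parser takes the raw value, so the ':' and '/'
    characters in a URL do not need Lucene escaping.
    """
    return "{!prefix f=%s}%s" % (field, url_prefix)


def build_select_url(base, url_prefix, rows=10):
    """Build a /select request URL that lists matching document ids as JSON."""
    params = {
        "q": prefix_query(url_prefix),
        "fl": "id",
        "rows": rows,
        "wt": "json",
    }
    return base.rstrip("/") + "/select?" + urllib.parse.urlencode(params)


def count_indexed(base, url_prefix):
    """Ask Solr how many indexed documents still match the prefix (needs a live Solr)."""
    with urllib.request.urlopen(build_select_url(base, url_prefix, rows=0)) as resp:
        return json.load(resp)["response"]["numFound"]


def build_delete_payload(url_prefix):
    """Build the JSON body for a delete-by-query request to the /update handler."""
    return json.dumps({"delete": {"query": prefix_query(url_prefix)}})
```

Calling count_indexed(SOLR_BASE, "http://example.com/unwanted/") before and
after re-running "solrindex" would show whether the documents are still in
the index. If they are, the body from build_delete_payload could be POSTed
as application/json to SOLR_BASE + "/update?commit=true" to remove them
explicitly.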

Thank you.

-- 
wassalam,
[bayu]
