Hi Lewis,

The SolrClean utility is working fine.

The problem was on my side: I did the initial crawl with a crawl_id, but
this id was not picked up when running solr clean (I didn't have it set in
nutch-site.xml).

I found this out when I looked at StorageUtils.java and saw how the web
store is created:

    String crawlId = conf.get(Nutch.CRAWL_ID_KEY, "");
    
    if (!crawlId.isEmpty()) {
      conf.set("schema.prefix", crawlId + "_");
    } else {
      conf.set("schema.prefix", "");
    }

This is why the map method in CleanMapper was never called: with no crawl_id
configured, the job used the web store without a prefix, which didn't exist,
so there were no rows to map over.
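
To illustrate the effect, here is a minimal standalone sketch of the same
prefix logic. A plain Map stands in for the Hadoop Configuration, the key
"storage.crawl.id" is what I believe Nutch.CRAWL_ID_KEY resolves to (please
check Nutch.java in your version), and the crawl id "mycrawl" and store name
"webpage" are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaPrefixSketch {
    // Assumed value of Nutch.CRAWL_ID_KEY; verify against your Nutch version.
    static final String CRAWL_ID_KEY = "storage.crawl.id";

    // Same decision as the StorageUtils snippet above: a non-empty crawl id
    // becomes "<crawlId>_", otherwise the prefix is empty.
    static String resolvePrefix(Map<String, String> conf) {
        String crawlId = conf.getOrDefault(CRAWL_ID_KEY, "");
        return crawlId.isEmpty() ? "" : crawlId + "_";
    }

    public static void main(String[] args) {
        // Configuration as seen by the crawl (crawl id supplied).
        Map<String, String> crawlConf = new HashMap<>();
        crawlConf.put(CRAWL_ID_KEY, "mycrawl"); // hypothetical crawl id

        // Configuration as seen by solr clean (crawl id missing).
        Map<String, String> cleanConf = new HashMap<>();

        String store = "webpage"; // hypothetical store name
        System.out.println(resolvePrefix(crawlConf) + store); // mycrawl_webpage
        System.out.println(resolvePrefix(cleanConf) + store); // webpage
    }
}
```

The two jobs end up pointing at different stores, which matches what I saw:
the crawl wrote to the prefixed store and the clean job read the unprefixed one.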

I have now added the crawl_id to nutch-site.xml, so the correct web store is
used when doing a solr clean.

Claudiu.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrClean-not-available-in-nutch-2-x-tp4081385p4081757.html
Sent from the Nutch - User mailing list archive at Nabble.com.
