Hi all.
Nutch has a configuration option that saves the response time for every URL that is 
fetched. The value is stored in the CrawlDatum under the key _rs_, but it is not 
indexed.
It would be very useful to index this value as well.
This value is very important in all cases, and it would be very easy to support this 
natively in Nutch.
A small change to the index-basic plugin (or another one) can make this happen.


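    // Index the response time for each URL if http.store.responsetime is true
    boolean storeResponseTime = conf.getBoolean("http.store.responsetime", true);
    if (storeResponseTime) {
      // "_rs_" may be missing for some URLs, so guard against null
      // (Writable is org.apache.hadoop.io.Writable)
      Writable responseTime = datum.getMetaData().get(new Text("_rs_"));
      if (responseTime != null) {
        doc.add("responseTime", responseTime.toString());
      }
    }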

I can open the JIRA ticket and write the patch for this.
What do you think about it?
