Try this:

<property>
  <name>index.db.md</name>
  <value></value>
  <description>
     Comma-separated list of keys to be taken from the crawldb metadata to generate fields.
     Can be used to index values propagated from the seeds with the plugin urlmeta
  </description>
</property>

Then enable the index-metadata plugin (iirc), and you are good to go!
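
For the response time case, the two settings might look roughly like the sketch below. It assumes the key is _rs_ (taken from the quoted message) and that the rest of plugin.includes resembles a typical Nutch 1.x setup; the plugin list shown is only illustrative, so keep your own list and just add index-metadata to it. Whether the plugin copes with the value type stored under _rs_ is not confirmed here, so check the indexed output.

<property>
  <name>index.db.md</name>
  <!-- assumption: _rs_ is the crawldb metadata key holding the response time -->
  <value>_rs_</value>
</property>

<property>
  <name>plugin.includes</name>
  <!-- illustrative list only; the relevant change is adding "metadata" to the index-(...) group -->
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor|metadata)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>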

Cheers,
Markus

 
 
-----Original message-----
> From:Eyeris Rodriguez Rueda <[email protected]>
> Sent: Monday 6th February 2017 15:56
> To: [email protected]
> Subject: make responseTime native in nutch
> 
> Hi all.
> Nutch has a configuration option that stores the response time for every URL
> that is fetched. This value is stored in the CrawlDatum under the key _rs_ but
> is not indexed.
> It would be very useful to index this value as well.
> This value is very important in all cases, and it is very easy to make this
> native in Nutch.
> A small change to the index-basic plugin (or another one) can make this happen.
> 
> 
> // Index the response time for each URL if http.store.responsetime is true
>     boolean storeResponseTime = conf.getBoolean("http.store.responsetime", true);
>     if (storeResponseTime) {
>       // The _rs_ entry can be missing, so guard against a null value
>       Writable value = datum.getMetaData().get(new Text("_rs_"));
>       if (value != null) {
>         doc.add("responseTime", value.toString());
>       }
>     }
> 
> I can create the Jira ticket and the patch for this.
> What do you think about it?
> 
> 
