Hi all. I need to get and index the response time for each URL that Nutch crawls. I have added a responseTime field in Solr for this value.
Is there any way to do this with configuration only, or do I need to write my own plugin to extract this key ("_rs_") from the CrawlDatum? Any help with the steps would be appreciated. I have set the http.store.responsetime property to true; what am I missing?

This is the property in my nutch-site.xml:

<property>
  <name>http.store.responsetime</name>
  <value>true</value>
  <description>Enables us to record the response time of the host which is the time period between start connection to end connection of a pages host. The response time in milliseconds is stored in CrawlDb in CrawlDatum's meta data under key "_rs_"
  </description>
</property>

After that I added the key to db.parsemeta.to.crawldb, but when I run parsechecker I don't see any data related to responseTime in the output:

<property>
  <name>db.parsemeta.to.crawldb</name>
  <value>"_rs_"</value>
  <description>Comma-separated list of parse metadata keys to transfer to the crawldb (NUTCH-779). Assuming for instance that the languageidentifier plugin is enabled, setting the value to 'lang' will copy both the key 'lang' and its value to the corresponding entry in the crawldb.
  </description>
</property>
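For reference, here is a configuration-only sketch of what I think might work. It is untested and assumes a Nutch 1.x release that ships the index-metadata plugin, and that the plugin is enabled in plugin.includes. As far as I understand, its index.db.md property copies CrawlDatum metadata keys into the indexed document. I am also assuming the resulting field would be named "_rs_", so it would still have to be mapped or copied to responseTime on the Solr side.

```xml
<!-- Sketch only: assumes the index-metadata plugin is listed in plugin.includes. -->
<property>
  <name>index.db.md</name>
  <!-- Key written by http.store.responsetime; no quotes around the key. -->
  <value>_rs_</value>
  <description>Comma-separated list of CrawlDb metadata keys to index
  (assumed here to be picked up by the index-metadata plugin).
  </description>
</property>
```

If this is right, the value should show up when running indexchecker rather than parsechecker, since the key lives in the CrawlDatum and not in the parse metadata.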

