Ah, I see. The docs are misleading; the quotes are not meant to be copied verbatim.
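
For reference, a minimal sketch of how the advice in this thread might look in nutch-site.xml. It assumes the index-metadata plugin is already listed in plugin.includes, and that its index.db.md property is what copies CrawlDb metadata keys into the index. Neither of those is spelled out in the messages below, so treat both as assumptions to check against nutch-default.xml for your version. The indexed field would presumably be named _rs_, so a Solr field called responseTime would still need its own mapping.

```xml
<!-- nutch-site.xml fragment: a sketch, not a verified configuration -->
<property>
  <name>http.store.responsetime</name>
  <value>true</value>
</property>

<property>
  <name>db.parsemeta.to.crawldb</name>
  <!-- bare key: no surrounding quotes and no &quot; entities -->
  <value>_rs_</value>
</property>

<!-- Assumption: index-metadata is enabled in plugin.includes and
     index.db.md lists the CrawlDb metadata keys to add to the index -->
<property>
  <name>index.db.md</name>
  <value>_rs_</value>
</property>
```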
-----Original message-----
> From:Eyeris Rodriguez Rueda <[email protected]>
> Sent: Tuesday 31st January 2017 15:32
> To: [email protected]
> Subject: Re: [MASSMAIL]how to index response time for a url ?
>
> Thanks, Markus, for the help.
> I have read the description of this property (below), and it says that the crawl
> datum saves that value, so I thought it was necessary to take responseTime
> from it.
> I will try using only the _rs_ key.
>
> > <property>
> > <name>http.store.responsetime</name>
> > <value>true</value>
> > <description>Enables us to record the response time of the
> > host which is the time period between start connection to end
> > connection of a pages host. The response time in milliseconds
> > is stored in CrawlDb in CrawlDatum's meta data under key "_rs_"
> > </description>
> > </property>
>
> ----- Original message -----
> From: "Markus Jelsma" <[email protected]>
> To: [email protected]
> Sent: Tuesday, 31 January 2017 9:55:10
> Subject: RE: [MASSMAIL]how to index response time for a url ?
>
> I am not sure what is going on, but those HTML entities &quot; certainly do
> not belong there. _rs_ is good enough. Then you also need index-metadata, and
> have the indexer add _rs_ to your index.
>
> <property>
> <name>db.parsemeta.to.crawldb</name>
> <value>"_rs_"</value>
> <description>Comma-separated list of parse metadata keys to transfer to the
> crawldb (NUTCH-779).
> Assuming for instance that the languageidentifier plugin is enabled,
> setting the value to 'lang'
> will copy both the key 'lang' and its value to the corresponding entry in
> the crawldb.
> </description>
>
> -----Original message-----
> > From:Eyeris Rodriguez Rueda <[email protected]>
> > Sent: Tuesday 31st January 2017 14:32
> > To: [email protected]
> > Subject: Re: [MASSMAIL]how to index response time for a url ?
> >
> > Can anybody help me, please?
> > Is this only happening to me?
> >
> > ----- Original message -----
> > From: "Eyeris Rodriguez Rueda" <[email protected]>
> > To: [email protected]
> > Sent: Sunday, 29 January 2017 22:28:01
> > Subject: [MASSMAIL]how to index response time for a url ?
> >
> > Hi all.
> > I need to get and index the response time for each URL that Nutch crawls.
> > I have added a responseTime field in Solr for this value.
> >
> > Is there any way to do this with configuration only, or do I need to write my own
> > plugin to extract the "_rs_" key from the crawl datum?
> > Any help with the steps would be appreciated.
> >
> > I have set the http.store.responsetime property to true; what am I
> > missing?
> >
> > This is my nutch-site.xml property:
> >
> > <property>
> > <name>http.store.responsetime</name>
> > <value>true</value>
> > <description>Enables us to record the response time of the
> > host which is the time period between start connection to end
> > connection of a pages host. The response time in milliseconds
> > is stored in CrawlDb in CrawlDatum's meta data under key "_rs_"
> > </description>
> > </property>
> >
> > After that I added the key, but when I run parsechecker I don't see any data
> > related to responseTime in the output.
> >
> > <property>
> > <name>db.parsemeta.to.crawldb</name>
> > <value>"_rs_"</value>
> > <description>Comma-separated list of parse metadata keys to transfer to
> > the crawldb (NUTCH-779).
> > Assuming for instance that the languageidentifier plugin is enabled,
> > setting the value to 'lang'
> > will copy both the key 'lang' and its value to the corresponding entry
> > in the crawldb.
> > </description>
> > </property>

