Hi, I am trying to implement an indexing and parsing plugin for a specific purpose. I need the Last-Modified HTTP header of the HTML documents in order to estimate their publication date. When I call parse.getData().getContentMeta().get(org.apache.nutch.metadata.HttpHeaders.LAST_MODIFIED), it returns null because the Last-Modified HTTP information is not stored in the metadata. Does anyone know how to get it? Do I need to change the fetcher? Any advice would be very useful.
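To make clear what I want to do with the value once I can read it, here is a minimal sketch of the date-handling step only. It assumes the header arrives as a plain string in the usual HTTP date format and treats a null value (which is what I get today) as "no date available". The names LastModifiedSketch and parseLastModified are just my own illustration, not part of Nutch, and the code uses only java.time.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, not a Nutch class: turns a Last-Modified header
// string into an Instant, or an empty Optional if it is missing or malformed.
final class LastModifiedSketch {

    static Optional<Instant> parseLastModified(String headerValue) {
        // The metadata lookup currently returns null, so handle that first.
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            // Assumption: the server sends the RFC 1123 style date,
            // e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            ZonedDateTime parsed = ZonedDateTime.parse(
                    headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
            return Optional.of(parsed.toInstant());
        } catch (DateTimeParseException e) {
            // Malformed header: fall back to "unknown" rather than failing.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLastModified("Wed, 21 Oct 2015 07:28:00 GMT"));
        System.out.println(parseLastModified(null));
    }
}
```

In the plugin, the string passed to parseLastModified would be whatever the metadata lookup above returns, which is the part I cannot get to work.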
Thanks in advance. Javier.
