I'm not sure I fully understand your email.
What do you mean by "cache"?

If you use the standard search results page, each result has a link
to a cached copy of the page.  If the page was HTML, there is no
problem; if the page was binary, however, the link returns an HTTP 500
internal server error.
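
A minimal sketch of the kind of content-type check a cached-page view
would need before writing stored bytes into an HTML page. This is not
Nutch's code: the class name CachedContentCheck and the method
canRenderInline are hypothetical, and the assumption that the 500 error
comes from treating binary content as text is only a guess.

```java
import java.util.Locale;

public class CachedContentCheck {

    // Hypothetical helper, not part of Nutch: returns true when a cached
    // copy with this Content-Type could be shown inline as text.
    public static boolean canRenderInline(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" and normalise case.
        String base = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return base.startsWith("text/") || base.equals("application/xhtml+xml");
    }

    public static void main(String[] args) {
        // HTML can be shown inline; a PDF would have to be served as raw
        // bytes (or not offered as a cached link at all).
        System.out.println(canRenderInline("text/html; charset=UTF-8"));
        System.out.println(canRenderInline("application/pdf"));
    }
}
```

With a check like this, a binary document could be sent back with its
original content type rather than being pushed through the text path.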

You can see this by clicking the "cached" link on any of the PDF
documents in the search results on my search engine:
http://ldssearch.com/search.jsp?lang=en&query=pdf



steven shingler wrote:
> Hi all,
>
> I'm trying to find out which filetypes nutch will cache.
>
> for example: it does html, but not pdf.
>
> Is there any documentation on how different filetypes are handled?
>
> Is it possible to configure nutch to cache pdfs etc?
>
> Any advice very gratefully received.
> Thanks,
> Steve









--
http://JacobBrunson.com
