>
> I'm not sure I completely understand your email.
> What do you mean by "cache"?

On the standard search results page, each hit has a link to a cached
copy of the page.  If the page was HTML, there are no problems.  If
the page was binary, however, the link returns an HTTP 500 Internal
Server Error.

You can see this by clicking the "cached" link on any of the PDF
documents in the search results on my search engine:
http://ldssearch.com/search.jsp?lang=en&query=pdf
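
For illustration only, here is a minimal Python sketch of the kind of
content-type check a cached-page view could make. It is not Nutch code:
the function name `cached_view_mode` and the list of text-like types are
my own assumptions, and I am assuming (without having verified it) that
the error comes from trying to render binary content as text.

```python
# Hypothetical helper, not part of Nutch: decide how a cached copy
# could be served, based only on its Content-Type header value.

# Assumed set of types that are safe to render inline as text.
TEXT_LIKE_PREFIXES = ("text/", "application/xhtml+xml", "application/xml")


def cached_view_mode(content_type):
    """Return 'inline' for text-like content, 'download' otherwise."""
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    base = (content_type or "").split(";", 1)[0].strip().lower()
    if not base:
        # Unknown type: treat it as binary rather than guess.
        return "download"
    if base.startswith(TEXT_LIKE_PREFIXES):
        return "inline"
    return "download"


if __name__ == "__main__":
    for ct in ("text/html; charset=UTF-8", "application/pdf", None):
        print(ct, "->", cached_view_mode(ct))
```

With a check like this, a PDF hit would be handed back as a download
instead of being pushed through the HTML cached view.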


>
> steven shingler wrote:
> > Hi all,
> >
> > I'm trying to find out which filetypes nutch will cache.
> >
> > for example: it does html, but not pdf.
> >
> > Is there any documentation on how different filetypes are handled?
> >
> > Is it possible to configure nutch to cache pdfs etc?
> >
> > Any advice very gratefully received.
> > Thanks,
> > Steve
> >


-- 
http://JacobBrunson.com

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
