Note that you can also test the parsing using:

  bin/nutch org.apache.nutch.parse.ParserChecker http://www.egamaster.com/datos/politica_fr.pdf

On 26 November 2010 11:35, Saphira <[email protected]> wrote:
>
> I'll try to call directly from tika, but if you say that it gets parsed
> ok...
> may be the problem the integration of tika with nutch?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Unable-to-extract-PDF-content-tp1971600p1972270.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

