Hello Jon, and sorry for the late response,

> I'd appreciate any thoughts. Perhaps something for parser policy. I've
> traced the source code a bit and nothing jumped out at me...

There are some known issues with the parser policy (i.e. ParserFactory),
and we are actively working on them.
I don't understand why the parse-ext plugin is called in your case, when it
should be the parse-pdf or parse-html plugin.
Here's a workaround: if you don't need parse-ext (the plugin that performs
parsing through external commands), simply remove it and everything should
work.
Could you please send me your /usr/local/nutch/plugins/parse-ext/plugin.xml
file so that I can check whether something is wrong in it?
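
If it helps, the workaround above could be scripted roughly as follows. This
is only a sketch: it assumes Nutch picks up every plugin directory found under
/usr/local/nutch/plugins, and the helper name disable_plugin and the
plugins-disabled backup directory are made up for illustration. It moves the
directory aside instead of deleting it, so plugin.xml stays available.

```python
import shutil
from pathlib import Path


def disable_plugin(plugins_dir, plugin_name, backup_dir):
    """Move one plugin directory out of plugins_dir and return its new path.

    Hypothetical helper: not part of Nutch, just a safer form of "remove it".
    """
    source = Path(plugins_dir) / plugin_name
    if not source.is_dir():
        raise FileNotFoundError(f"no such plugin directory: {source}")

    target_root = Path(backup_dir)
    target_root.mkdir(parents=True, exist_ok=True)

    target = target_root / plugin_name
    if target.exists():
        # Refuse to overwrite an earlier backup of the same plugin.
        raise FileExistsError(f"backup already exists: {target}")

    shutil.move(str(source), str(target))
    return target


# Example call (paths are assumptions based on your install location):
# disable_plugin("/usr/local/nutch/plugins", "parse-ext",
#                "/usr/local/nutch/plugins-disabled")
```

Moving the directory back under plugins/ restores the plugin if you need it
again later.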

Regards

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
