Hi,

It looks like Tika does not include a PostScript parser, at least not in the 
copy that comes with Nutch 1.1. Is this right? I just want to double-check because 
PostScript is a major file format. I get errors "Can't retrieve Tika parser for 
mime-type application/postscript" in the log when Nutch comes across a 
PostScript file. I've found a reference to parser-pdf being associated with 
PostScript, but it does not work any better: it tries to treat PostScript files 
as PDF and fails, if I interpret its complaints correctly.

Could anyone help with parsing PostScript in Nutch, please? It is hard to 
believe that this is not implemented.

Thanks,

Arkadi
