Works on Linux for me.  Let me break out my Windows laptop.
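
In the meantime, one way to narrow this down on the Windows side is to check whether the temp file named in the warning below is actually on disk, and non-empty, at the moment ImageMagick would be invoked. Here is a minimal JDK-only sketch of that check. It is not Tika code: the class name TempFileCheck and the method describe are made up for illustration, and it only reuses the "apache-tika-" prefix and ".tmp" suffix that appear in the log.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Tika: reports whether a file that is
// about to be handed to an external tool is missing, empty, or usable.
public class TempFileCheck {

    public static String describe(Path p) throws IOException {
        if (!Files.exists(p)) {
            return "missing";
        }
        if (Files.size(p) == 0) {
            return "empty";
        }
        return "ok";
    }

    public static void main(String[] args) throws IOException {
        // Same prefix/suffix as the temp file names in the warning.
        Path tmp = Files.createTempFile("apache-tika-", ".tmp");
        System.out.println(tmp + " -> " + describe(tmp)); // just created, zero bytes

        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println(tmp + " -> " + describe(tmp)); // now has content

        Files.delete(tmp);
        System.out.println(tmp + " -> " + describe(tmp)); // gone again
    }
}
```

If a check like this reports "missing" or "empty" for the path in the command line, that would fit the "no images found" message from magick, though I have not confirmed that this is what is happening here.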

On Wed, Feb 10, 2021 at 12:34 PM Peter Kronenberg
<[email protected]> wrote:
>
> I think yesterday’s code introduced a bug.  The temporary file that is 
> created for ImageMagick is not there.
>
> [main] INFO org.apache.tika.parser.ocr.TesseractOCRParser - Tesseract is 
> installed and is being invoked. This can add greatly to processing time.  If 
> you do not want tesseract to be applied to your files see: 
> https://cwiki.apache.org/confluence/display/TIKA/TikaOCR#TikaOCR-disable-ocr
>
> magick: no images found for operation `-resize' at CLI arg 9 @ 
> error/operation.c/CLIOption/5361.
>
> [main] WARN org.apache.tika.parser.ocr.TesseractOCRParser - ImageMagick 
> failed (commandline: [magick, -density, 300, -depth, 4, -colorspace, gray, 
> -filter, triangle, -resize, 200%, 
> C:\Users\PETERK~1\AppData\Local\Temp\apache-tika-3889844060604687745.tmp, 
> C:\Users\PETERK~1\AppData\Local\Temp\apache-tika-3889844060604687745.tmp])
>
> org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
>     at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
>     at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
>     at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
>     at org.apache.tika.parser.ocr.ImagePreprocessor.process(ImagePreprocessor.java:121)
>     at org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:280)
>     at org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:248)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.parser.image.AbstractImageParser.parse(AbstractImageParser.java:94)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.torchai.ImageMagick.parse(ImageMagick.java:43)
>     at org.torchai.ImageMagick.main(ImageMagick.java:56)
>
> Text: MARLEY was dead, to begin with. There is no doubt whatever about
> that. The register of his burial was signed by the clergyman, the clerk,
> the undertaker, and the chief mourner. Scrooge signed it. And
> Scrooge’s name was good upon ’Change, for anything he chose to put
> his hand to.
>
> Here’s the code:
>
> public static String parse(String file) throws TikaException, SAXException, IOException {
>     final AutoDetectParser parser = new AutoDetectParser(new TikaConfig());
>     final ParseContext parseContext = new ParseContext();
>     final TesseractOCRConfig tessConfig = new TesseractOCRConfig();
>     parseContext.set(AutoDetectParser.class, parser);
>     parseContext.set(TesseractOCRConfig.class, tessConfig);
>     tessConfig.setEnableImageProcessing(true);
>     ContentHandler contentHandler = new BodyContentHandler();
>     Metadata metadata = new Metadata();
>
>     try (TikaInputStream stream = TikaInputStream.get(new BufferedInputStream(new FileInputStream(file)))) {
>         parser.parse(stream, contentHandler, metadata, parseContext);
>     }
>
>     return contentHandler.toString();
> }
