K. Thank you.  Will check.

On Wed, Feb 10, 2021 at 2:23 PM Peter Kronenberg <[email protected]>
wrote:

> I have also noticed since yesterday that there are files in my temp
> directory that aren’t being cleaned up.  All of these files contain the
> output of Tesseract.
>
> *From:* Peter Kronenberg
> *Sent:* Wednesday, February 10, 2021 12:35 PM
> *To:* [email protected]
> *Subject:* Error calling ImageMagick
>
>
>
> I think yesterday’s code introduced a bug.  The temporary file that is
> created for ImageMagick is not there.
>
> [main] INFO org.apache.tika.parser.ocr.TesseractOCRParser - Tesseract is
> installed and is being invoked. This can add greatly to processing time.
> If you do not want tesseract to be applied to your files see:
> https://cwiki.apache.org/confluence/display/TIKA/TikaOCR#TikaOCR-disable-ocr
>
> magick: no images found for operation `-resize' at CLI arg 9 @
> error/operation.c/CLIOption/5361.
>
> [main] WARN org.apache.tika.parser.ocr.TesseractOCRParser - ImageMagick
> failed (commandline: [magick, -density, 300, -depth, 4, -colorspace, gray,
> -filter, triangle, -resize, 200%,
> C:\Users\PETERK~1\AppData\Local\Temp\apache-tika-3889844060604687745.tmp,
> C:\Users\PETERK~1\AppData\Local\Temp\apache-tika-3889844060604687745.tmp])
>
> org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
>     at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
>     at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
>     at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
>     at org.apache.tika.parser.ocr.ImagePreprocessor.process(ImagePreprocessor.java:121)
>     at org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:280)
>     at org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:248)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.parser.image.AbstractImageParser.parse(AbstractImageParser.java:94)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:277)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.torchai.ImageMagick.parse(ImageMagick.java:43)
>     at org.torchai.ImageMagick.main(ImageMagick.java:56)
>
> Text: MARLEY was dead, to begin with. There is no doubt whatever about
> that. The register of his burial was signed by the clergyman, the clerk,
> the undertaker, and the chief mourner. Scrooge signed it. And
> Scrooge’s name was good upon ’Change, for anything he chose to put
> his hand to.
>
> Here’s the code:
>
> public static String parse(String file)
>         throws TikaException, SAXException, IOException {
>     final AutoDetectParser parser = new AutoDetectParser(new TikaConfig());
>     final ParseContext parseContext = new ParseContext();
>     final TesseractOCRConfig tessConfig = new TesseractOCRConfig();
>     parseContext.set(AutoDetectParser.class, parser);
>     parseContext.set(TesseractOCRConfig.class, tessConfig);
>     tessConfig.setEnableImageProcessing(true);
>
>     ContentHandler contentHandler = new BodyContentHandler();
>     Metadata metadata = new Metadata();
>
>     try (TikaInputStream stream = TikaInputStream.get(
>             new BufferedInputStream(new FileInputStream(file)))) {
>         parser.parse(stream, contentHandler, metadata, parseContext);
>     }
>
>     return contentHandler.toString();
> }
>
>
>
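
For anyone trying to confirm the leftover-temp-file observation above, here is a
minimal sketch that lists candidate files in the JVM temp directory. It uses only
the JDK. The class name LeftoverTempFiles and the glob "apache-tika-*.tmp" are
assumptions: the glob is taken from the file names in the ImageMagick log lines,
and the leftover Tesseract output files may be named differently, so adjust it as
needed. Running it before and after a parse should show whether new files are
left behind.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika.
final class LeftoverTempFiles {

    // Returns regular files in dir whose names match "apache-tika-*.tmp".
    // The pattern is an assumption based on the names shown in the log.
    static List<Path> find(Path dir) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, "apache-tika-*.tmp")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    matches.add(p);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // java.io.tmpdir is the directory the JVM uses for temp files.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        for (Path p : find(tmp)) {
            System.out.println(p + " (" + Files.size(p) + " bytes)");
        }
    }
}
```

This only reports files; it does not delete anything.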
