    [ https://issues.apache.org/jira/browse/TIKA-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888076#comment-16888076 ]

Tim Allison commented on TIKA-2908:
-----------------------------------

>could you explain how reordering the closing of the streams fixed this?

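tmp (the TemporaryResources instance) holds the reference to the tmp file and 
tries to delete that file when you dispose it.  In the earlier code, dispose() 
was called before the FileOutputStream on the tmp file was closed and before 
the FileInputStream on that file was closed, so the delete ran while the file 
was still open; that is the "being used by another process" error Windows 
reports.  I, frankly, have no idea how that worked before... :P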
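
To illustrate the ordering, here is a minimal sketch that uses only the JDK, 
not Tika's TemporaryResources. The class name TempFileCloseOrder and the 
method roundTrip are hypothetical and are not taken from the Tika patch. Both 
streams are closed before the temp file is deleted:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; illustrates close-before-delete ordering only.
public class TempFileCloseOrder {

    /**
     * Writes data to a temp file, reads it back, and deletes the file only
     * after both streams have been closed. Returns the path of the (already
     * deleted) temp file so a caller can check that cleanup succeeded.
     */
    public static Path roundTrip(byte[] data) {
        try {
            Path tmp = Files.createTempFile("close-order-demo-", ".tmp");
            try {
                // try-with-resources closes the output stream at the end of the block
                try (OutputStream out = new FileOutputStream(tmp.toFile())) {
                    out.write(data);
                }
                // likewise, the input stream is closed before we leave this block
                try (InputStream in = new FileInputStream(tmp.toFile())) {
                    while (in.read() != -1) {
                        // drain the stream
                    }
                }
            } finally {
                // no stream is open on the file any more, so the delete can proceed
                Files.delete(tmp);
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the delete were moved inside either try-with-resources block, it would run 
while a stream still had the file open, which is the situation described 
above.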

> TikaException: Failed to close temporary resource - how to fix?
> ---------------------------------------------------------------
>
>                 Key: TIKA-2908
>                 URL: https://issues.apache.org/jira/browse/TIKA-2908
>             Project: Tika
>          Issue Type: Bug
>          Components: ocr, parser
>    Affects Versions: 1.21
>            Reporter: Marichi Gupta
>            Priority: Blocker
>              Labels: ocr, tesseract, tika
>
> I am using Apache Tika on Windows 10, jre 1.8.0_181, and I've imported Tika 
> using Maven with the following dependencies:
> <dependencies>
>   <dependency>
>     <groupId>junit</groupId>
>     <artifactId>junit</artifactId>
>     <version>3.8.1</version>
>     <scope>test</scope>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.tika</groupId>
>     <artifactId>tika-parsers</artifactId>
>     <version>1.21</version>
>   </dependency>
> </dependencies>
> I have the code below for performing OCR using Tesseract (which I have 
> independently tested and know to be working):
> public static void OCRTest() {
>     try {
>         BufferedImage im = ImageIO.read(new File(OCR_IMAGE));
>         TesseractOCRConfig config = new TesseractOCRConfig();
>         config.setTessdataPath("C:\\Program Files\\Tesseract-OCR\\tessdata");
>         config.setTesseractPath("C:\\Program Files\\Tesseract-OCR");
>         ParseContext parseContext = new ParseContext();
>         parseContext.set(TesseractOCRConfig.class, config);
>         TesseractOCRParser parser = new TesseractOCRParser();
>         BodyContentHandler handler = new BodyContentHandler();
>         Metadata metadata = new Metadata();
>         try {
>             parser.parse(im, handler, metadata, parseContext);
>             System.out.println(handler.toString());
>         } catch (SAXException e) {
>             e.printStackTrace();
>         } catch (TikaException e) {
>             e.printStackTrace();
>         }
>     } catch (IOException e) {
>         e.printStackTrace();
>     }
> }
> I run into the following exception:
> org.apache.tika.exception.TikaException: Failed to close temporary resources
>     at org.apache.tika.io.TemporaryResources.dispose(TemporaryResources.java:174)
>     at org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:251)
>     at test.test.App.OCRTest(App.java:46)
>     at test.test.App.main(App.java:30)
> Caused by: java.nio.file.FileSystemException: C:\Users\m\AppData\Local\Temp\apache-tika-2643805894084124300.tmp: The process cannot access the file because it is being used by another process.
>
> The tmp file is present in the Temp folder. I have the source code downloaded 
> and have stepped through it with the debugger - the error comes from 
> attempting to close the tmp file. There is another post on this board 
> (https://issues.apache.org/jira/browse/TIKA-1732) where someone else has run 
> into the same exception, although with the AutoDetectParser and not 
> Tesseract. Their issue seemed to be a conflict in their imported jars, but I 
> run into this issue even with only the Apache Tika libraries installed. I 
> have a feeling this is a concurrency issue, but I can't pinpoint the conflict.
> I don't run into this issue when using Tika's AutoDetectParser, only with 
> the TesseractOCRParser. This is an important part of an application I'm 
> working on, so I would really appreciate any insights on how to proceed.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
