[ https://issues.apache.org/jira/browse/TIKA-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542477#comment-13542477 ]
Peter Nordquist commented on TIKA-1040:
---------------------------------------
I'm also running into this issue, on Windows 7 with Java 1.6.0_37. I did a
little digging and I think it is related to the fact that the mp4parser
library memory-maps files through FileChannels. That runs into a known Java
issue, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4715154: the mapped
region needs to be garbage collected before the file can be deleted on
Windows.
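To make the mapping issue concrete, here is a minimal sketch that maps a
temporary file, closes the channel, and then tries to delete the file while
the mapped buffer is still reachable. The class and method names are
hypothetical, and it uses the java.nio.file API (Java 7+) for brevity, which
is newer than the Java 6 setup above. The outcome of the delete is platform
dependent, so it is reported rather than assumed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Tika or mp4parser.
class MappedDeleteDemo {

    /** Keeps the mapped buffer reachable and records whether the delete worked. */
    static final class Result {
        final MappedByteBuffer buffer;
        final boolean deleted;

        Result(MappedByteBuffer buffer, boolean deleted) {
            this.buffer = buffer;
            this.deleted = deleted;
        }
    }

    static Result mapThenDelete(byte[] content) throws IOException {
        Path tmp = Files.createTempFile("mapped-demo-", ".tmp");
        Files.write(tmp, content);

        MappedByteBuffer buffer;
        try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
            buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
        // The channel is closed here, but the mapping stays valid until the
        // buffer is garbage collected; there is no public unmap call.

        // File.delete() returns false instead of throwing, so the result can
        // simply be reported. Per the bug report above, expect false on
        // Windows while the mapping is alive.
        boolean deleted = tmp.toFile().delete();
        if (!deleted) {
            tmp.toFile().deleteOnExit(); // best effort cleanup
        }
        return new Result(buffer, deleted);
    }

    public static void main(String[] args) throws IOException {
        Result result = mapThenDelete(new byte[] {1, 2, 3});
        System.out.println("deleted while mapped: " + result.deleted);
    }
}
```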
At org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:117) there's a
call to grab the FileChannel. When the input is a plain stream, this writes
the stream to a file in the temp directory and takes the FileChannel from a
stream on that file. The channel is then passed to the constructor of IsoFile,
which parses the file using the ChannelHelper class, and ChannelHelper uses
mapped memory on line 33. The call chain, starting in the parser, is below
(outermost call first, the opposite of an exception stack trace). An important
thing to note is that IsoFile uses the PropertyBoxParserImpl class by default.
Its superclass is AbstractBoxParser and it does not override the parseBox
method, so I omitted PropertyBoxParserImpl from the chain. A couple of other
locations in the mp4parser library do the same memory mapping and are not
listed here, but this is the clearest example.
org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:117)
com.coremedia.iso.IsoFile.<init>(IsoFile.java:49)
com.coremedia.iso.IsoFile.parse(IsoFile.java:80)
com.coremedia.iso.AbstractBoxParser.parseBox(AbstractBoxParser.java:50)
com.coremedia.iso.ChannelHelper.readFully(ChannelHelper.java:33)
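For comparison, a read that does not map anything can copy the bytes into an
ordinary heap buffer, which leaves no mapped region behind to block a delete.
The sketch below only illustrates that idea. It is not mp4parser's actual
ChannelHelper code, and the class name, method signature, and int-sized read
are all assumptions.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical stand-in for a channel helper; not the mp4parser implementation.
class HeapReadSketch {

    /** Reads exactly size bytes from the channel into a heap buffer. */
    static ByteBuffer readFully(ReadableByteChannel channel, int size) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(size); // heap buffer, no mapping
        while (buffer.hasRemaining()) {
            // read() may return fewer bytes than requested, so loop until full.
            if (channel.read(buffer) < 0) {
                throw new EOFException("channel ended after " + buffer.position()
                        + " of " + size + " bytes");
            }
        }
        buffer.flip(); // switch from writing to reading
        return buffer;
    }
}
```

The trade-off is that the bytes are copied into the Java heap, which may cost
more memory for large boxes than a mapping would.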
As a workaround, you should be able to use
org.apache.tika.io.TikaInputStream.get(java.io.File) as the input stream
passed to parsers (either get method that takes a File should work). The code
at org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:115) would then
pick up the pre-existing TikaInputStream, and it should use the existing file
as the source for the FileChannels. However, I would like to pass plain
InputStreams, and that is the case where TemporaryResources tries to delete
the temporary file. For now I guess I'm stuck with writing the content to a
file myself and using the workaround above.
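The "writing the content myself" step could look roughly like the sketch
below. It uses only the JDK (java.nio.file, Java 7+), so the class name,
method name, and temp file prefix are hypothetical. The resulting file would
then be handed to TikaInputStream.get(java.io.File) as described above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Tika.
class SpoolToFile {

    /** Copies the stream into a new temporary file that the caller owns. */
    static Path spool(InputStream in) throws IOException {
        Path file = Files.createTempFile("tika-input-", ".tmp");
        try {
            // createTempFile already created the file, so replace it.
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            file.toFile().delete(); // do not leave a partial copy behind
            throw e;
        }
        return file;
    }
}
```

Deleting this file right after parsing may run into the same mapping
limitation on Windows, so cleanup probably has to be deferred, for example
with File.deleteOnExit() as a best effort.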
> Could not delete temporary file
> -------------------------------
>
> Key: TIKA-1040
> URL: https://issues.apache.org/jira/browse/TIKA-1040
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.2
> Environment: Windows XP 64
> Reporter: Carlos S. Zamudio
>
> Although I found an entry that suggested this had been resolved in 1.2, I
> continue to receive the exception below when attempting to extract metadata
> from a video file. In my case the file type is in the Quicktime MOV format.
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.mp4.MP4Parser@4413ee
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:248)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at test.TikaExamples_1pt2a.testTikaMetadata(TikaExamples_1pt2a.java:223)
>     at test.TikaExamples_1pt2a.main(TikaExamples_1pt2a.java:60)
> Caused by: java.io.IOException: Could not delete temporary file C:\DOCUME~1\CARLOS~1.SLA\LOCALS~1\Temp\apache-tika-1430602345143256975.tmp
>     at org.apache.tika.io.TemporaryResources$1.close(TemporaryResources.java:70)
>     at org.apache.tika.io.TemporaryResources.close(TemporaryResources.java:121)
>     at org.apache.tika.io.TikaInputStream.close(TikaInputStream.java:637)
>     at org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:119)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     ... 4 more