[
https://issues.apache.org/jira/browse/TIKA-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222927#comment-14222927
]
Milan Zivkovic commented on TIKA-1473:
--------------------------------------
Hi,
Indeed, I was using a FileInputStream, but I get the same problem when I
wrap it in a TikaInputStream.
{code}
public static void main( final String[] args ) throws IOException, TikaException {
    final String path = "path_to_file";
    final Metadata metadata = new Metadata();
    InputStream is = TikaInputStream.get( Files.newInputStream( Paths.get( path ) ) );
    is = TikaInputStream.get( is );
    final String someText = TIKA.parseToString( is, metadata, MAX_CONTENT_LENGTH );
    System.out.println( someText );
}
{code}
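To narrow down which resource fails to close, it may help to print the full cause chain of the reported "Failed to close temporary resources" exception. The following is only a minimal sketch using plain JDK calls: the class name CauseChain and the method describe are hypothetical helpers, and the exception built in main is a stand-in, on the assumption that the real TikaException carries an underlying cause.

```java
import java.io.IOException;

public class CauseChain {

    /** Returns one line per throwable in the cause chain, outermost first. */
    public static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        // Cap the depth to guard against a cyclic cause chain.
        for (Throwable cur = t; cur != null && depth < 16; cur = cur.getCause(), depth++) {
            sb.append(depth == 0 ? "" : "caused by: ")
              .append(cur.getClass().getName())
              .append(": ")
              .append(cur.getMessage())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the exception from the report; the nested IOException
        // and its message are hypothetical, not taken from a real stack trace.
        Exception wrapped = new Exception("Failed to close temporary resources",
                new IOException("hypothetical underlying failure"));
        System.out.print(describe(wrapped));
    }
}
```

In the snippet above, the same helper could be called from a catch block around the parseToString call, passing the caught exception instead of the stand-in.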
> Apache Tika is not working for .docx documents
> -----------------------------------------------
>
> Key: TIKA-1473
> URL: https://issues.apache.org/jira/browse/TIKA-1473
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.5, 1.6
> Reporter: Franco Catto
> Priority: Blocker
>
> I am using Apache Tika 1.6 to read different document files.
> It reads PDF and old-format .doc files, but when I try to read a .docx file,
> it gives me the following exception:
> org.apache.tika.exception.TikaException: Failed to close temporary resources
> at org.apache.tika.io.TemporaryResources.dispose(TemporaryResources.java:152)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:127)
> ...
> The resource cannot be closed because it is still in use by the Java
> process, most likely by the OOXML parser.