[ 
https://issues.apache.org/jira/browse/TIKA-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528313#comment-17528313
 ] 

Luís Filipe Nassif edited comment on TIKA-3738 at 4/26/22 5:23 PM:
-------------------------------------------------------------------

In our project, we worked around this by patching ForkParser to write all 
metadata to the content handler at the end of parsing rather than at the 
beginning. That is fine for our needs, since we use the metadata after parsing 
ends, but I am not sure whether it is acceptable for others.
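
A minimal sketch of why the ordering could matter, assuming the affected parsers only populate some metadata while they parse the document body. This is not the actual ForkParser patch and does not use Tika's API: `RecordingHandler`, `parseBody`, and the metadata keys and values are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetadataTimingSketch {

    /** Hypothetical stand-in for a content handler: it records what it receives. */
    static class RecordingHandler {
        final List<String> events = new ArrayList<>();

        void meta(String name, String value) {
            events.add("meta:" + name + "=" + value);
        }

        void text(String s) {
            events.add("text:" + s);
        }
    }

    /** Hypothetical parser that only learns the title while reading the body. */
    static void parseBody(Map<String, String> metadata, RecordingHandler handler) {
        handler.text("hello");
        metadata.put("title", "Example"); // set late, during parsing (assumption)
    }

    /** Writes every metadata entry known at the time of the call. */
    static void writeMetadata(Map<String, String> metadata, RecordingHandler handler) {
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            handler.meta(e.getKey(), e.getValue());
        }
    }

    /** Metadata written before parsing: entries set during parsing are missed. */
    static List<String> metadataFirst() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("Content-Type", "video/mp4"); // known before parsing
        RecordingHandler handler = new RecordingHandler();
        writeMetadata(metadata, handler);
        parseBody(metadata, handler);
        return handler.events;
    }

    /** Metadata written after parsing: late entries are included. */
    static List<String> metadataLast() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("Content-Type", "video/mp4"); // known before parsing
        RecordingHandler handler = new RecordingHandler();
        parseBody(metadata, handler);
        writeMetadata(metadata, handler);
        return handler.events;
    }

    public static void main(String[] args) {
        System.out.println("metadata first: " + metadataFirst());
        System.out.println("metadata last:  " + metadataLast());
    }
}
```

In this sketch the handler never sees `title` when metadata is written first. The trade-off is that a consumer reading the handler's output as a stream would receive the metadata only after the content.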



> ForkParser missing metadata for some document formats
> -----------------------------------------------------
>
>                 Key: TIKA-3738
>                 URL: https://issues.apache.org/jira/browse/TIKA-3738
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.3.0
>         Environment: Java 11.0.14.
>            Reporter: Stephen H
>            Priority: Major
>         Attachments: ForkParserIntegrationTest.java.diff, 
> testVideoMetadataMp4.mp4
>
>
> When using ForkParser, metadata from some parsers is not being returned in 
> the Metadata object or in the head of the returned XML. These include 
> OpenDocument Presentation (ODP), OpenDocument Spreadsheet (ODS), Microsoft 
> Word 2006 XML, MP4 Audio (M4A) and MP4 Video (MP4).
> A patch for ForkParserIntegrationTest showing the issue for these file types 
> is attached, along with an MP4 video file containing metadata, as there 
> doesn't appear to be one in the current test set.


