Konstantin Gribov created TIKA-1574:
---------------------------------------
Summary: Frames in header/footer in doc files aren't extracted
Key: TIKA-1574
URL: https://issues.apache.org/jira/browse/TIKA-1574
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.7
Environment: linux, openjdk7/openjdk8
Reporter: Konstantin Gribov
Assignee: Konstantin Gribov
Text from frames in the header/footer is omitted by WordParser. Text from frames
in the document body is extracted correctly.
The same document converted to docx is extracted in full.
This may be an upstream bug; I'll dig into it and file a ticket in the POI bug
tracker if that is the case.