[jira] [Created] (TIKA-1574) Frames in header/footer in doc files aren't extracted

Konstantin Gribov (JIRA) Wed, 11 Mar 2015 11:20:10 -0700

Konstantin Gribov created TIKA-1574:
---------------------------------------


             Summary: Frames in header/footer in doc files aren't extracted
                 Key: TIKA-1574
                 URL: https://issues.apache.org/jira/browse/TIKA-1574
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.7
         Environment: linux, openjdk7/openjdk8
            Reporter: Konstantin Gribov
            Assignee: Konstantin Gribov


Text from frames in the header/footer is omitted by WordParser. Text from frames 
in the document body is extracted fine.
The same document converted to docx is extracted fully.

This may be an upstream bug. I'll dig into it and file a ticket in the POI bug 
tracker if that's the case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
