[ https://issues.apache.org/jira/browse/TIKA-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058274#comment-18058274 ]
Tim Allison commented on TIKA-4657:
-----------------------------------
Thank you for noticing, opening the issue and supplying an example file!
> Endnote content in tables omitted from .docx text
> -------------------------------------------------
>
> Key: TIKA-4657
> URL: https://issues.apache.org/jira/browse/TIKA-4657
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 3.2.3
> Reporter: Klara Mazurak
> Priority: Major
> Fix For: 4.0.0, 3.3.0
>
> Attachments: with_table_-_endnote_content_omitted.docx,
> without_table_-_endnotes_work_correctly.docx
>
>
> If an endnote in a .docx file contains text in a table, that text is omitted
> from Tika's text extraction.
> See the two attached files: the one without a table yields all the text as
> expected, the one with the table does not.