[ https://issues.apache.org/jira/browse/TIKA-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147562#comment-16147562 ]
Tim Allison commented on TIKA-2448:
-----------------------------------
POI 61470 was just fixed, so our DOM model will be able to handle this once we
upgrade to a POI version later than 3.17.
> Handle phonetic strings in the SAX docx parser
> ----------------------------------------------
>
> Key: TIKA-2448
> URL: https://issues.apache.org/jira/browse/TIKA-2448
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Priority: Minor
> Labels: sax_docx_fixes
> Attachments: testWORD_phonetic.docx
>
>
> On TIKA-2440, [~Takahiro] requested the ability to turn off extraction of
> phonetic runs. We should enable this for docx, too. We'll have to make
> fixes in POI for our DOM docx parser, but it should be fairly straightforward
> in our SAX docx parser.
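
To illustrate why this is simple in a SAX-based parser, here is a minimal sketch using only the JDK's SAX API. It is not Tika's actual implementation: the class name PhoneticRunDemo, the includePhonetic flag, and the SAMPLE fragment are all hypothetical. It assumes the WordprocessingML layout in which phonetic (ruby) text sits inside w:rt and the base text inside w:rubyBase, and it drops any w:t text encountered while inside w:rt.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Tika or POI.
public class PhoneticRunDemo {

    // WordprocessingML main namespace.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hand-written sample fragment: one ruby with phonetic text "kana"
    // and base text "KANJI".
    static final String SAMPLE =
            "<w:p xmlns:w=\"" + W_NS + "\"><w:r><w:ruby>"
            + "<w:rt><w:r><w:t>kana</w:t></w:r></w:rt>"
            + "<w:rubyBase><w:r><w:t>KANJI</w:t></w:r></w:rubyBase>"
            + "</w:ruby></w:r></w:p>";

    static class TextHandler extends DefaultHandler {
        final boolean includePhonetic;
        final StringBuilder out = new StringBuilder();
        int rtDepth = 0;        // > 0 while inside a w:rt (phonetic) element
        boolean inText = false; // true while inside a w:t element

        TextHandler(boolean includePhonetic) {
            this.includePhonetic = includePhonetic;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if (!W_NS.equals(uri)) {
                return;
            }
            if ("rt".equals(localName)) {
                rtDepth++;
            } else if ("t".equals(localName)) {
                inText = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (!W_NS.equals(uri)) {
                return;
            }
            if ("rt".equals(localName)) {
                rtDepth--;
            } else if ("t".equals(localName)) {
                inText = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Keep text unless it is phonetic and phonetic output is disabled.
            if (inText && (includePhonetic || rtDepth == 0)) {
                out.append(ch, start, length);
            }
        }
    }

    static String extract(String xml, boolean includePhonetic) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on uri/localName
        SAXParser parser = factory.newSAXParser();
        TextHandler handler = new TextHandler(includePhonetic);
        parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extract(SAMPLE, true));
        System.out.println(extract(SAMPLE, false));
    }
}
```

Because a SAX handler sees element boundaries as events, a depth counter for w:rt is enough to decide whether to emit text. A DOM-based parser relies on the underlying object model exposing the phonetic run, which is why that path depends on a fix in POI.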