[ 
https://issues.apache.org/jira/browse/TIKA-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145760#comment-16145760
 ] 

Tim Allison commented on TIKA-2448:
-----------------------------------

Tracking potential fixes in POI's DOM model: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=61470

> Handle phonetic strings in the SAX docx parser
> ----------------------------------------------
>
>                 Key: TIKA-2448
>                 URL: https://issues.apache.org/jira/browse/TIKA-2448
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>              Labels: sax_docx_fixes
>         Attachments: testWORD_phonetic.docx
>
>
> On TIKA-2440, [~Takahiro] requested the ability to turn off extraction of 
> phonetic runs.  We should enable this for docx, too.  We'll have to make 
> fixes in POI for our DOM docx parser, but it should be fairly straightforward 
> in our SAX docx parser.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
