[Bug 61470] Text with phonetic runs aren't extracted in docx

bugzilla Thu, 31 Aug 2017 12:40:18 -0700

https://bz.apache.org/bugzilla/show_bug.cgi?id=61470


--- Comment #4 from Tim Allison <[email protected]> ---
Given this example:
162.242.228.174/docs/commoncrawl2/WI/WIFC2FI3QH64A6KHOBEDNQKLN5O5EYSS

I wonder if we should cache the phonetic content as we read through the
document and then emit it at the end.  That would still allow a document to be
found via the phonetic info, and it wouldn't completely wreck NLP applications,
since the phonetic text would no longer be interleaved with the main text.

For another issue...

