Further improvements to Word .doc and .docx parsing
---------------------------------------------------
Key: TIKA-552
URL: https://issues.apache.org/jira/browse/TIKA-552
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 0.8
Reporter: Nick Burch
Assignee: Nick Burch
Fix For: 0.9
This is a follow-on to TIKA-506, tracking the enhancements made to .doc and
.docx parsing between 0.8 and 0.9.
The list includes:
* Anchors and bookmarks
* Floating Word .doc pictures (marked by \u0008 rather than \u0001)
* Nested Word .doc tables
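
As a rough illustration of the floating-picture item, the sketch below scans a run of text for the two placeholder characters mentioned in the list. It is not the Tika or POI implementation: the class name PicturePlaceholders and the methods offsetsOf and stripPlaceholders are hypothetical, and it assumes the parser has already obtained the run's text as a Java String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika or POI: locates picture placeholders in text.
public class PicturePlaceholders {
    // Placeholder characters named in the list above:
    // U+0001 for an inline picture, U+0008 for a floating one.
    static final char INLINE_PICTURE = '\u0001';
    static final char FLOATING_PICTURE = '\u0008';

    /** Returns the offsets in text at which the given marker character occurs. */
    static List<Integer> offsetsOf(String text, char marker) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = text.indexOf(marker); i >= 0; i = text.indexOf(marker, i + 1)) {
            offsets.add(i);
        }
        return offsets;
    }

    /** Removes both placeholder characters so they do not leak into extracted text. */
    static String stripPlaceholders(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != INLINE_PICTURE && c != FLOATING_PICTURE) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

A caller that previously checked only for U+0001 would, under this sketch, call offsetsOf once per marker so that floating pictures are reported as well as inline ones.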