https://bz.apache.org/bugzilla/show_bug.cgi?id=61266

Javen O'Neal <one...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |enhancement

--- Comment #5 from Javen O'Neal <one...@apache.org> ---
Looks like POI doesn't currently support reading this file format.

Opening the binary file in a text editor reveals that most of the document
contents are saved as ASCII, with a few special characters to embed figures and
designate the start of sections. This doesn't look like any OLE2 file I have
seen before.

Presumably, if all that is needed is text extraction, you could run `strings` on
this document.
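
For anyone who wants the same effect from Java, here is a minimal sketch of
what `strings` does: scan the raw bytes and keep runs of printable ASCII of
at least some minimum length. This is not a POI API. The class name
AsciiStrings, the extract method, and the minimum run length of 4 are all
assumptions made for illustration, and the unknown format's section and
figure markers are simply treated as separators.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: a rough stand-in for `strings`.
public class AsciiStrings {

    // Returns every run of printable ASCII bytes that is at least minLength long.
    static List<String> extract(byte[] data, int minLength) {
        List<String> runs = new ArrayList<>();
        int start = -1; // index where the current printable run began, or -1 if none
        for (int i = 0; i <= data.length; i++) {
            boolean printable = i < data.length && isPrintable(data[i]);
            if (printable) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                // The run ended at a non-printable byte or at end of input.
                if (i - start >= minLength) {
                    runs.add(new String(data, start, i - start, StandardCharsets.US_ASCII));
                }
                start = -1;
            }
        }
        return runs;
    }

    // Space through tilde, plus tab. Java bytes are signed, so values
    // above 0x7F are negative and fail the lower bound.
    static boolean isPrintable(byte b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t';
    }

    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String run : extract(data, 4)) {
            System.out.println(run);
        }
    }
}
```

Run it as `java AsciiStrings <file>`. Because the format is unidentified, the
output will likely include stray fragments from embedded figures alongside
the real text, so expect to filter it by hand.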

Changing this to an enhancement request in case someone is interested in
figuring out what archaic file format this is and writing a primitive parser
that can extract text from the document.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org
