https://bz.apache.org/bugzilla/show_bug.cgi?id=60942
Bug ID: 60942
Summary: Avoid unicode check in Word 6.0 docs
Product: POI
Version: 3.16-dev
Hardware: PC
Status: NEW
Severity: normal
Priority: P2
Component: HWPF
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
This is a half step towards 60936.
"On TIKA-2313, Steven Hall submitted an example Word 6.0 file whose extracted
text is garbage."
From what I can tell, Word 6.0 didn't use Unicode. Until we can figure out how
the codepage was specified in 6.0, we should at least turn off the Unicode check.
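
A minimal sketch of the idea, not POI's actual code. It assumes the "Unicode check" is the Word 97 style test of bit 30 in a piece's file offset (flag set means 8-bit text, flag clear means UTF-16LE). The class and method names are hypothetical, and windows-1252 is only a placeholder until the real Word 6.0 codepage source is known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; these names are not part of POI.
final class PieceTextSketch {
    // Assumption: in Word 97+ files, bit 30 of the piece offset marks 8-bit text.
    static final int COMPRESSED_FLAG = 0x40000000;

    // Assumption: fall back to windows-1252 for 8-bit text until the
    // Word 6.0 codepage can be read from the file itself.
    static final Charset ASSUMED_8BIT_CHARSET = Charset.forName("windows-1252");

    // Decide whether a piece should be decoded as UTF-16LE.
    static boolean isUnicode(int fc, boolean isWord6) {
        if (isWord6) {
            // Word 6.0: skip the flag check entirely and treat text as 8-bit.
            return false;
        }
        return (fc & COMPRESSED_FLAG) == 0;
    }

    // Decode the raw bytes of one text piece.
    static String decode(byte[] bytes, int fc, boolean isWord6) {
        Charset cs = isUnicode(fc, isWord6)
                ? StandardCharsets.UTF_16LE
                : ASSUMED_8BIT_CHARSET;
        return new String(bytes, cs);
    }
}
```

With the check left on, 8-bit bytes whose offset happens to have the flag clear are decoded as UTF-16LE, which pairs the bytes up into unrelated characters; that would be consistent with the garbage text reported on TIKA-2313.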