https://bz.apache.org/bugzilla/show_bug.cgi?id=60685
Javen O'Neal <one...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|POI Overall                 |HPBF

--- Comment #3 from Javen O'Neal <one...@apache.org> ---
The Microsoft Publisher binary .pub format is undocumented, as indicated here:
https://poi.apache.org/hpbf/index.html

OpenOffice/LibreOffice doesn't have documentation or an open source
application that reads this .pub format, to my knowledge, so that means we'd
have to resort to figuring out the format through lots of hard work.

Assuming the file you have provided is valid (opens without warnings or errors
in Microsoft Publisher), if you're mostly interested in text extraction, then
skipping over this hyperlink is probably preferable over throwing an exception.
We can log the error that we catch and move forward with extraction.
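The "log the error and move forward" approach suggested in the comment could look roughly like the sketch below. It is only an illustration of the pattern, not the actual HPBF code: the class LenientTextExtraction, the Part interface and the extractText method are hypothetical names made up for this example, and a real fix would have to live wherever the extractor currently throws on the hyperlink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LenientTextExtraction {
    private static final Logger LOG =
            Logger.getLogger(LenientTextExtraction.class.getName());

    /** Hypothetical stand-in for one piece of a document (text run, hyperlink, ...). */
    interface Part {
        String text() throws Exception;
    }

    /**
     * Concatenates the text of every part. A part that fails to parse is
     * logged and skipped, so one bad hyperlink does not abort extraction.
     */
    static String extractText(List<Part> parts) {
        StringBuilder sb = new StringBuilder();
        for (Part part : parts) {
            try {
                sb.append(part.text());
            } catch (Exception e) {
                // Assumption: losing this one part is acceptable for text extraction.
                LOG.log(Level.WARNING, "Skipping unreadable part", e);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Part> parts = new ArrayList<>();
        parts.add(() -> "Hello ");
        // Simulates the hyperlink that currently triggers the exception.
        parts.add(() -> { throw new IllegalStateException("malformed hyperlink"); });
        parts.add(() -> "world");
        System.out.println(extractText(parts)); // prints "Hello world"
    }
}
```

The trade-off is that the extracted text may silently lack the skipped hyperlink, so the warning in the log is the only trace that something was dropped.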