https://bz.apache.org/bugzilla/show_bug.cgi?id=57847

Nick Burch <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #2 from Nick Burch <[email protected]> ---
For slightly complicated reasons, we have two different .doc -> .html
converters: one in the POI codebase (WordToHtmlConverter) and one in the Tika
codebase (org.apache.tika.parser.microsoft.WordExtractor).

It'd be great if you could try the same file with Apache Tika and see whether
it manages to get the lists out. (For a quick check, grab the tika-app jar and
run it with --html.)
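
As a rough sketch of that quick check: the jar and file names below are
placeholders, so substitute whichever tika-app jar you downloaded and your own
.doc file. The guard is only there so the snippet does nothing if either file is
missing.

```shell
# Hypothetical paths: point these at the tika-app jar you downloaded
# and at the .doc file that shows the problem.
TIKA_JAR="${TIKA_JAR:-tika-app.jar}"
DOC="${DOC:-problem.doc}"
OUT="tika-output.html"

if [ -f "$TIKA_JAR" ] && [ -f "$DOC" ]; then
    # --html makes tika-app write the document as XHTML to stdout
    java -jar "$TIKA_JAR" --html "$DOC" > "$OUT"
    echo "Wrote $OUT"
else
    echo "Set TIKA_JAR and DOC to existing files first" >&2
fi
```

Then open tika-output.html and look at the places where your lists should be.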

If Apache Tika does it right, we can hopefully bring over the logic to the
AbstractWordConverter family of converters. If not, we can look to fix it in
both at the same time!
