Denis Kildishev created TIKA-1144:
-------------------------------------
Summary: Changes in styling mechanism, inner table support and
list support for Word Extractor
Key: TIKA-1144
URL: https://issues.apache.org/jira/browse/TIKA-1144
Project: Tika
Issue Type: Improvement
Components: parser
Reporter: Denis Kildishev
Priority: Minor
The current version of POI provides mechanisms that can be used to support
different kinds of styling and list handling. At the moment, Tika supports
styling of separate character runs, but this approach is not ideal and can lead
to visual glitches in the form of pseudo spaces.
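
A minimal sketch of one way to avoid per-run markup, assuming the glitches come
from adjacent runs with identical formatting each being wrapped in their own
tags. The StyledRun type and the RunMerger class are hypothetical stand-ins for
illustration; they are not POI or Tika classes, and this is not necessarily the
approach taken in the proposed change.

```java
import java.util.ArrayList;
import java.util.List;

public class RunMerger {
    /** Hypothetical stand-in for a character run: its text plus two style flags. */
    record StyledRun(String text, boolean bold, boolean italic) {
        boolean sameStyle(StyledRun other) {
            return bold == other.bold && italic == other.italic;
        }
    }

    /** Merges neighbouring runs that carry exactly the same style flags. */
    static List<StyledRun> merge(List<StyledRun> runs) {
        List<StyledRun> out = new ArrayList<>();
        for (StyledRun run : runs) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).sameStyle(run)) {
                // Same style as the previous run: extend it instead of starting a new one.
                StyledRun prev = out.get(last);
                out.set(last, new StyledRun(prev.text() + run.text(), prev.bold(), prev.italic()));
            } else {
                out.add(run);
            }
        }
        return out;
    }

    /** Emits one pair of tags per merged run rather than one pair per original run. */
    static String toHtml(List<StyledRun> runs) {
        StringBuilder sb = new StringBuilder();
        for (StyledRun r : merge(runs)) {
            if (r.bold()) sb.append("<b>");
            if (r.italic()) sb.append("<i>");
            sb.append(r.text());
            if (r.italic()) sb.append("</i>");
            if (r.bold()) sb.append("</b>");
        }
        return sb.toString();
    }
}
```

With this sketch, two consecutive bold runs "Hel" and "lo" are emitted inside a
single bold element rather than as two separately tagged fragments.
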
Another area is lists. Information about them can already be obtained from the
POI representation, but this mechanism is not used in the current version of
the Word Extractor.
A further problem that can be solved now is that of inner (nested) tables. It
is not directly related to the two problems above, but its solution is based on
the same mechanism. An example of incorrect handling is a file containing a
table that includes another table in its first cell.
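
A minimal sketch of how nesting depth could drive table output, under the
assumption that each paragraph reports how deeply it is nested in tables (in
POI's HWPF model, Paragraph.getTableLevel() provides such a value). The Para
type and the TableNesting class are hypothetical, and rows and cells are left
out to keep the idea visible; this is an illustration, not the Word Extractor's
actual code.

```java
import java.util.List;

public class TableNesting {
    /** Hypothetical stand-in for a paragraph: text plus table depth (0 = outside any table). */
    record Para(String text, int tableLevel) {}

    /** Opens and closes table elements as the nesting depth of the paragraphs changes. */
    static String render(List<Para> paras) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (Para p : paras) {
            // Going deeper: open one table per missing level.
            while (depth < p.tableLevel()) {
                sb.append("<table>");
                depth++;
            }
            // Coming back out: close one table per surplus level.
            while (depth > p.tableLevel()) {
                sb.append("</table>");
                depth--;
            }
            sb.append("<p>").append(p.text()).append("</p>");
        }
        // Close anything still open at the end of the document.
        while (depth > 0) {
            sb.append("</table>");
            depth--;
        }
        return sb.toString();
    }
}
```

The case described above, where the inner table sits in the first cell,
corresponds to a first paragraph at depth 2 followed by paragraphs at depth 1;
the sketch then opens two tables before any text and closes the inner one when
the depth drops back to 1.
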