[ 
https://issues.apache.org/jira/browse/TIKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Kildishev updated TIKA-1144:
----------------------------------

    Attachment: word_style.patch

Added a better version of the patch.
                
> Changes in styling mechanism, inner table support and list support for Word 
> Extractor
> -------------------------------------------------------------------------------------
>
>                 Key: TIKA-1144
>                 URL: https://issues.apache.org/jira/browse/TIKA-1144
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Denis Kildishev
>            Priority: Minor
>         Attachments: word_style.patch
>
>
> Mechanisms in the current version of POI can be used to support different
> kinds of styling and list handling. At the moment, Tika styles each
> Character Run separately, but this approach is not ideal and can lead to
> visual glitches in the form of pseudo spaces.
> Lists are another case. Information about them can already be obtained from
> the POI representation, but the current version of the Word Extractor does
> not use it.
> Inner (nested) tables can also be handled now. This problem is not directly
> related to the previous two, but its solution is based on the same
> mechanism. An example of a file that is handled incorrectly is one with a
> table that contains another table in its first cell.
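
For illustration only, here is a rough sketch of how the nested-table case
described above could be handled. It is not the code from word_style.patch
and does not call POI. It assumes each paragraph carries a table nesting
level (the kind of value POI's HWPF Paragraph.getTableLevel() is understood
to report). The names NestedTableSketch, Para and toEvents are hypothetical,
and rows and cells are ignored to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Tika's or POI's actual code.
final class NestedTableSketch {

    // Minimal stand-in for a Word paragraph: its text and its table nesting
    // depth (0 = outside any table, 1 = top-level table, 2 = table in a cell).
    static final class Para {
        final String text;
        final int tableLevel;

        Para(String text, int tableLevel) {
            this.text = text;
            this.tableLevel = tableLevel;
        }
    }

    // Walks paragraphs in document order and emits a table open/close marker
    // each time the nesting depth changes, so inner tables stay nested.
    static List<String> toEvents(List<Para> paras) {
        List<String> events = new ArrayList<>();
        int depth = 0;
        for (Para p : paras) {
            while (depth < p.tableLevel) { events.add("<table>"); depth++; }
            while (depth > p.tableLevel) { events.add("</table>"); depth--; }
            events.add("p:" + p.text);
        }
        // Close any tables still open at the end of the document.
        while (depth > 0) { events.add("</table>"); depth--; }
        return events;
    }

    public static void main(String[] args) {
        // A table whose first cell contains another table.
        List<Para> paras = new ArrayList<>();
        paras.add(new Para("intro", 0));
        paras.add(new Para("inner cell", 2));
        paras.add(new Para("outer cell", 1));
        paras.add(new Para("outro", 0));
        System.out.println(toEvents(paras));
    }
}
```

Tracking a depth counter instead of a single "in table" flag is what lets
the inner table open and close inside the outer one rather than being
flattened into it.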

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
