[ https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14747292#comment-14747292 ]
Tim Allison commented on TIKA-1731:
-----------------------------------
Please don't stop watching. We can use your help! Many thanks for your
contributions so far.
Once we do the integration, it would be helpful to have a test document for each
of the document formats that exercises the various components (headers, footers,
footnotes, tables, text boxes, embedded documents, tables of contents, to name a
few).
Or, at the least, if you could run the integration (once it is completed)
against a batch of docs and let us know what you find, that would be helpful.
Thank you, again!
> Try to integrate java-hwp into Tika
> -----------------------------------
>
> Key: TIKA-1731
> URL: https://issues.apache.org/jira/browse/TIKA-1731
> Project: Tika
> Issue Type: New Feature
> Reporter: Tim Allison
> Priority: Minor
>
> Now that we have detection working for hwp files, it would be great to add a
> parser.
> [java-hwp|https://github.com/ddoleye/java-hwp] looks like a promising
> candidate. We'd need to ask ddoleye about a potential change in license and
> then about interest in maintaining it and publishing it to Maven.
> Any other candidates?