[ https://issues.apache.org/jira/browse/TIKA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737990#comment-14737990 ]
mungeol heo edited comment on TIKA-1731 at 9/10/15 1:52 AM:
------------------------------------------------------------
{quote}did hwp ever go the ooxml route after its OLE phase{quote}
After a little search, I think it did.
{quote}does it diverge from standard ooxml at all{quote}
It supports Microsoft OOXML (Office Open XML).
You can load an OOXML document in the HWP editor, or save a document in OOXML
format from it. (I am not sure whether this information helps.)
For instance, you can load an MS doc file or save as an MS doc file.
{quote}can Tika+POI as they are handle it{quote}
I think so? The author of java-hwp says he used Apache POI's POIFS file system
to handle the compound file format of HWP 5.0.
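As a rough illustration of why POIFS looks plausible here: OLE2 compound files, the container format POIFS reads, begin with a fixed 8-byte signature (D0 CF 11 E0 A1 B1 1A E1). The sketch below only checks for that signature using the JDK. The class and method names (Ole2Sniffer, looksLikeOle2) are hypothetical and are not part of Tika, POI, or java-hwp. A match says only that the container is OLE2, not that the file is HWP; confirming that would mean inspecting the streams inside the container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not an API of Tika, POI, or java-hwp.
final class Ole2Sniffer {
    // The 8-byte signature found at offset 0 of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes start with the OLE2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (header[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the first 8 bytes of a file and checks them against the signature.
    static boolean looksLikeOle2(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = in.readNBytes(header, 0, header.length);
            return read == header.length && looksLikeOle2(header);
        }
    }
}
```

If an HWP 5.0 sample passes this check, that would be consistent with java-hwp's reliance on POIFS, though the HWP-specific parsing on top of the container would still have to come from java-hwp or new code.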
> Try to integrate java-hwp into Tika
> -----------------------------------
>
> Key: TIKA-1731
> URL: https://issues.apache.org/jira/browse/TIKA-1731
> Project: Tika
> Issue Type: New Feature
> Reporter: Tim Allison
> Priority: Minor
>
> Now that we have detection working for hwp files, it would be great to add a
> parser.
> [java-hwp|https://github.com/ddoleye/java-hwp] looks like a promising
> candidate. We'd need to ask ddoleye about a potential change in license and
> then about interest in maintaining it and pushing it to Maven.
> Any other candidates?