Thank you for your reply. I will try to write a custom parser for HWP files, and if my code is "pretty enough", I will consider contributing it. Again, thank you.
On Tue, Sep 1, 2015 at 7:58 PM, Nick Burch <[email protected]> wrote:
> On Tue, 1 Sep 2015, Mungeol Heo wrote:
>>>
>>> java -jar tika-app-1.10.jar --list-supported-types | grep hwp
>>> application/x-hwp
>
> That means the mime type has been defined in some way
>
>>> java -jar tika-app-1.10.jar --detect sample.hwp
>>> application/x-tika-msoffice
>
> That means that the HWP file is based on the OLE2 file format, but that
> no-one has told Tika about that, so detection isn't working properly. If you
> could create a new bug in JIRA for this, and upload a very small HWP file
> (ideally just a few KB), we can get that fixed
>
>> And another thing is, there is no 'application/x-hwp' in the supported
>> formats list which are mentioned at
>> 'http://tika.apache.org/1.10/formats.html' page.
>
> That means there is no parser available for HWP, and you'd need to write +
> contribute one
>
>> So, does tika support "HWP"?
>
> Depends on your definition of "supports"!
>
> Nick
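
A small aside on the detection point quoted above: an OLE2 compound file starts with a fixed 8-byte signature, which is presumably why a generic OLE2 type gets reported for sample.hwp. The sketch below only checks that signature using the plain JDK. It is not how Tika itself is implemented, and the class and method names (Ole2Sniffer, isOle2) are made up for illustration. It also cannot tell HWP apart from other OLE2-based files; that would need a look inside the container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Tika: checks for the OLE2 header signature.
final class Ole2Sniffer {
    // The 8-byte signature found at the start of an OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE2 signature.
    public static boolean isOle2(byte[] header) {
        return header.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2Sniffer <file>");
            return;
        }
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the stream ends.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        boolean ole2 = isOle2(Arrays.copyOf(header, total));
        System.out.println(args[0] + (ole2 ? ": OLE2 container" : ": not OLE2"));
    }
}
```

After compiling with javac, it can be run as `java Ole2Sniffer sample.hwp`. A real fix would still belong in Tika's own detection, as suggested in the quoted reply.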
