Well, it's a *very* basic implementation, so expect it to break (it worked for me on a few simple books).
I haven't used Tika before, but I'm very interested in it, so I just copied some of the ODF parts to get started and learn how Tika works. I'll take a look at the zip / xml parsers to improve it.

Best regards,

Bart
________________________________________
From: Jukka Zitting (JIRA) [j...@apache.org]
Sent: Friday, October 09, 2009 11:46 AM
To: tika-dev@lucene.apache.org
Subject: [bulk] [jira] Commented: (TIKA-302) patch: initial support for ePUB

    [ https://issues.apache.org/jira/browse/TIKA-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763942#action_12763942 ]

Jukka Zitting commented on TIKA-302:
------------------------------------

Excellent, thanks! I just got a Sony e-reader, so I'm very much interested in this feature. I'll review the patch in more detail in a few days.

As a general comment, it looks like the new parser could (should?) better leverage our existing zip and html/xml parsers.

> patch: initial support for ePUB
> --------------------------------
>
>                 Key: TIKA-302
>                 URL: https://issues.apache.org/jira/browse/TIKA-302
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 0.4
>            Reporter: Bart Hanssens
>         Attachments: initial-epub-support.patch
>
>
> Initial support for ePUB e-books