[ https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280433#comment-14280433 ]
Tim Allison commented on TIKA-1511:
-----------------------------------
Hmmmm... This will fail if someone supplies a custom EmbeddedDocumentExtractor,
because there is no way to pass the StatementTablePair through to that
interface via ParseContext.
Some options:
1) We could go back to treating the db as one big doc, as we do with xls, but I
think I'd prefer to treat each table as a separate doc.
2) We could get rid of the StatementTablePair hack, extract the text from each
table into a String, and then pass that String, wrapped in an InputStream, into
EmbeddedDocumentExtractor. The drawback is that we'd bypass the handler and
lose any potential <tr>/<td> markup.
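A rough sketch of what option 2 might look like, using only the JDK so it runs on its own. The class name TableTextSketch and the helpers tableToText and asStream are hypothetical and are not taken from the attached patch; the in-memory rows stand in for a JDBC ResultSet. It mostly illustrates the drawback: once the table is flattened to text, the row/cell structure is gone before any handler sees it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class TableTextSketch {

    // Hypothetical helper: flatten one table into plain text,
    // one line per row, cells separated by tabs.
    // A real parser would iterate a JDBC ResultSet instead of a List.
    static String tableToText(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(String.join("\t", row)).append('\n');
        }
        return sb.toString();
    }

    // Wrap the flattened text in an InputStream, which is the only
    // per-document payload an embedded document extractor receives.
    static InputStream asStream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "alice"),
                Arrays.asList("2", "bob"));

        InputStream stream = asStream(tableToText(rows));

        // In Tika, this stream is what would be handed to the
        // EmbeddedDocumentExtractor. Here we just read it back to show
        // that only flat text survives (no <tr>/<td> structure).
        System.out.print(new String(stream.readAllBytes(), StandardCharsets.UTF_8));
    }
}
```

Because the stream is self-contained, a custom extractor would not need anything extra from ParseContext, which is the appeal of this option despite the lost markup.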
Any ideas on this?
> Create a parser for SQLite3
> ---------------------------
>
> Key: TIKA-1511
> URL: https://issues.apache.org/jira/browse/TIKA-1511
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.6
> Reporter: Luis Filipe Nassif
> Fix For: 1.8
>
> Attachments: TIKA-1511v1.patch, testSQLLite3b.db
>
>
> I think it would be very useful, as sqlite is used as data storage by a wide
> range of applications. Opening the ticket to track it.