[
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-1511:
------------------------------
Attachment: TIKA-1511v1.patch
testSQLLite3b.db
First draft of patch attached. Need to build out tests, obviously, and I'll
fix spelling of SQLLite in the class names! :)
For the design, I had to create a public parser that instantiates a new
*DBParser for each call to parse (like many other parsers) to avoid
thread-safety issues.
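As a rough illustration of that per-call delegation (not the code in the patch), here is a minimal sketch with no Tika dependency. The class names PublicDbParser and PerCallDbParser, and the list-of-table-names signature, are hypothetical stand-ins chosen only to show why a fresh instance per call sidesteps shared mutable state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-call *DBParser. It keeps mutable state
// while it works, so a single instance must not be shared between threads.
final class PerCallDbParser {
    private final List<String> tablesSeen = new ArrayList<>();

    List<String> parse(List<String> tableNames) {
        for (String name : tableNames) {
            tablesSeen.add(name);
        }
        // Return a copy so callers never see this instance's internal list.
        return new ArrayList<>(tablesSeen);
    }
}

// Hypothetical stand-in for the public parser. It holds no fields of its own,
// so one shared instance can be called from many threads; each call gets a
// brand-new PerCallDbParser.
final class PublicDbParser {
    List<String> parse(List<String> tableNames) {
        return new PerCallDbParser().parse(tableNames);
    }
}
```

Because the public class has no state, two calls on the same PublicDbParser never see each other's tables, whether they run one after the other or concurrently.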
The *DBParser, in turn, calls the EmbeddedDocumentParser for each table and
specifies, via a special mime type, which *TableParser will be called.
The *TableParser ignores the InputStream, and grabs the StatementTablePair from
the ParseContext to parse each table.
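The handoff through the context could look roughly like the sketch below. It is self-contained and does not use Tika: SketchContext is a hypothetical stand-in for a class-keyed context such as ParseContext, TablePair is a hypothetical stand-in for StatementTablePair (its real fields are not described here, so the table name and pre-fetched rows are assumptions), and SketchTableParser stands in for a *TableParser.

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a class-keyed context such as ParseContext.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key) {
        // Class.cast returns null when no entry was stored for this key.
        return key.cast(entries.get(key));
    }
}

// Hypothetical stand-in for StatementTablePair. The fields are assumptions:
// a table name plus already-fetched rows instead of a live JDBC Statement.
final class TablePair {
    final String tableName;
    final List<String> rows;

    TablePair(String tableName, List<String> rows) {
        this.tableName = tableName;
        this.rows = rows;
    }
}

// Hypothetical stand-in for a *TableParser: the stream argument is ignored
// and the table to parse is taken from the context instead.
final class SketchTableParser {
    String parse(InputStream ignored, SketchContext context) {
        TablePair pair = context.get(TablePair.class);
        if (pair == null) {
            throw new IllegalStateException("no TablePair in context");
        }
        return pair.tableName + ": " + String.join(", ", pair.rows);
    }
}
```

The caller sets the pair on the context just before handing off each table, so the table parser needs nothing from the stream it is given.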
The JDBC wrapper around SQLite is apparently not able to read CLOBs, although
I could write them without an exception (which doesn't mean they were actually
written). It also does some other things that are not standard JDBC, but that
is all handled in SQLiteTableParser, a subclass of AbstractTableParser.
Any and all feedback is welcomed. This is still drafty.
> Create a parser for SQLite3
> ---------------------------
>
> Key: TIKA-1511
> URL: https://issues.apache.org/jira/browse/TIKA-1511
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.6
> Reporter: Luis Filipe Nassif
> Fix For: 1.8
>
> Attachments: TIKA-1511v1.patch, testSQLLite3b.db
>
>
> I think it would be very useful, as sqlite is used as data storage by a wide
> range of applications. Opening the ticket to track it.