[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-03-29 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385836#comment-14385836
 ] 

Konstantin Gribov edited comment on TIKA-1511 at 3/29/15 4:05 PM:
--

The idea of better separating the tika-parsers module was discussed some time
ago; it's also mentioned in the Tika 2.0 roadmap
(https://wiki.apache.org/tika/Tika2_0RoadMap).

In that case, users would get the appropriate {{tika-parsers-\*}} modules with
their deps (e.g., via {{mvn dependency:copy}} or something similar), and Solr
could depend only on {{tika-core}} and a minimal set of {{tika-parsers-\*}}
modules. Alternatively, Solr could depend on {{tika-core}} alone, but that will
lead to the standard questions about why it doesn't work, as happened with
{{slf4j}} in Solr 4.


was (Author: grossws):
Idea of better tika-parsers module separation was discussed some time ago, it's 
also mentioned in Tika 2.0 roadmap 
(https://wiki.apache.org/tika/Tika2_0RoadMap).

In such case, user would get appropriate {{tika-parsers-*}} modules with their 
deps (e. g., via {{mvn dependency:copy}} or something similar) and Solr can 
depend only on {{tika-core}} and minimal {{tika-parsers-*}}. Or with dependency 
only on {{tika-core}} but it will lead to standard questions like why it 
doesn't work as with {{slf4j}} in solr4.

 Create a parser for SQLite3
 ---

 Key: TIKA-1511
 URL: https://issues.apache.org/jira/browse/TIKA-1511
 Project: Tika
  Issue Type: New Feature
  Components: parser
Affects Versions: 1.6
Reporter: Luis Filipe Nassif
 Fix For: 1.8

 Attachments: TIKA-1511v1.patch, TIKA-1511v2.patch, TIKA-1511v3.patch, 
 TIKA-1511v3bis.patch, testSQLLite3b.db, testSQLLite3b.db


 I think it would be very useful, as sqlite is used as data storage by a wide 
 range of applications. Opening the ticket to track it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-02-13 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320022#comment-14320022
 ] 

Konstantin Gribov edited comment on TIKA-1511 at 2/13/15 12:40 PM:
---

[~talli...@mitre.org], you can also make it {{<optional>true</optional>}} 
instead of {{provided}}.

Also, I can't find parser itself 
({{org.apache.tika.parser.jdbc.SQLite3Parser}}) in trunk rev 1659449.


was (Author: grossws):
[~talli...@mitre.org], you can also make it {{<optional>true</optional>}} 
instead of {{provided}}.

Also, I can't find parser itself 
({{org.apache.tika.parser.jdbc.SQLite3Parser}})in trunk rev 1659449.

 Create a parser for SQLite3
 ---

 Key: TIKA-1511
 URL: https://issues.apache.org/jira/browse/TIKA-1511
 Project: Tika
  Issue Type: New Feature
  Components: parser
Affects Versions: 1.6
Reporter: Luis Filipe Nassif
 Fix For: 1.8

 Attachments: TIKA-1511v1.patch, TIKA-1511v2.patch, TIKA-1511v3.patch, 
 testSQLLite3b.db, testSQLLite3b.db


 I think it would be very useful, as sqlite is used as data storage by a wide 
 range of applications. Opening the ticket to track it. 





[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-30 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298670#comment-14298670
 ] 

Tim Allison edited comment on TIKA-1511 at 1/30/15 2:25 PM:


Thank you, Nick, for reviewing this!  I'll fix the wildcards...not sure how 
those crept in and the assertContains...

I'm not happy with the added complexity of the JDBCInputStream.

Bottom line: should we get rid of that option and back off to a zero-byte 
InputStream and grabbing the table object from the OpenContainer?  That would 
simplify quite a bit, including detection... And, it would make this parser 
behave like the PST parser...I think.  If we really want to add it later, we 
can, but simpler is better...

[~lfcnassif], would you be ok with that proposal?

As for another jdbc-based format, I completely agree.  Can you recommend 
another single-file db format?  Access comes to mind, but I can't find a pure 
Java parser that has jdbc: Jackcess (LGPL) has its own api and doesn't support 
jdbc.  I looked briefly at derby, hsqldb, mysql, and they all seem to rely on a 
directory of files...I very well could have missed a single file option for 
those, though...

Maybe h2 (MPL and EPL [licenses|http://www.h2database.com/html/license.html])?





was (Author: talli...@mitre.org):
Thank you, Nick, for reviewing this!  I'll fix the wildcards...not sure how 
those crept in and the assertContains...

I'm not happy with the added complexity of the JDBCInputStream.

Bottom line: should we get rid of that option and back off to a zero-byte 
InputStream and grabbing the table object from the OpenContainer?  That would 
simplify quite a bit, including detection... And, it would make this parser 
behave like the PST parser...I think.  If we really want to add it later, we 
can, but simpler is better...

[~lfcnassif], would you be ok with that proposal?

As for another jdbc-based format, I completely agree.  Can you recommend 
another single-file db format?  Access comes to mind, but I can't find a pure 
Java parser that has jdbc: Jackcess (LGPL) has its own api and doesn't support 
jdbc.  I looked briefly at derby, hsqldb, mysql, and they all seem to rely on a 
directory of files...I very well could have missed a single file option for 
those, though...







[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-26 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291829#comment-14291829
 ] 

Tim Allison edited comment on TIKA-1511 at 1/26/15 1:52 PM:


The RecursiveParserWrapper should allow that, no?  With the caveat that it 
caches all output in memory...


was (Author: talli...@mitre.org):
The RecursiveParserWrapper should allow, that, no?



[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-26 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291829#comment-14291829
 ] 

Tim Allison edited comment on TIKA-1511 at 1/26/15 2:36 PM:


The RecursiveParserWrapper should allow that, no?  With the caveat that it 
caches all output in memory...  You should be able to parse the output from the 
standard recursive XHTML output as well.  Right?

If you have a chance (and if you haven't done so already), fork branch 1511 
from my github site and take a look at the output of the test cases...throw in 
some print statements and see if that'll work.  For 
testRecursiveParserWrapper(), change 
BasicContentHandlerFactory.HANDLER_TYPE.BODY to 
BasicContentHandlerFactory.HANDLER_TYPE.XML.


was (Author: talli...@mitre.org):
The RecursiveParserWrapper should allow that, no?  With the caveat that it 
caches all output in memory...



[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-21 Thread Luis Filipe Nassif (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285621#comment-14285621
 ] 

Luis Filipe Nassif edited comment on TIKA-1511 at 1/21/15 3:14 PM:
---

No problems, the design looks good!


was (Author: lfcnassif):
No problems, the desing looks good!



[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-19 Thread Luis Filipe Nassif (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281086#comment-14281086
 ] 

Luis Filipe Nassif edited comment on TIKA-1511 at 1/19/15 12:01 PM:


If the inputStream (pseudoInputStream) received by EmbeddedDocExtractor cannot 
be read, I think using EDE is not useful. How will this approach work with 
the TikaCli --extract option? My original idea was to support a use case of 
extracting each table to its own file...

Now I think this extraction of tables to files can be done by handling the db 
as one big doc and using a ContentHandlerDecorator that splits the xhtml 
output at table boundaries. Each xhtml segment can be converted to a byte[] (if 
small) and then to a ByteArrayInputStream that can be handled by an 
EmbeddedDocExtractor, if one is set in the parseContext. If none is set, the 
ContentHandlerDecorator does not need to split the xhtml output and can fall 
back to the default behavior. A custom EDE can then extract tables to files if 
desired.
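
[Editor's sketch] The splitting idea above could look roughly like the following. This is a minimal illustration that uses only the JDK's SAX classes rather than Tika's ContentHandlerDecorator; the class name TableSplittingHandler and the Consumer-based sink (standing in for a custom EDE) are hypothetical, and attributes are dropped to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: buffers each <table> element and hands it to a sink. */
final class TableSplittingHandler extends DefaultHandler {
    private final Consumer<ByteArrayInputStream> tableSink; // stands in for a custom EDE
    private StringBuilder current; // non-null only while inside a <table>
    private int depth;             // nesting depth of <table> elements

    TableSplittingHandler(Consumer<ByteArrayInputStream> tableSink) {
        this.tableSink = tableSink;
    }

    // Works with both namespace-aware and non-namespace-aware SAX parsers.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String n = name(localName, qName);
        if ("table".equals(n) && depth++ == 0) {
            current = new StringBuilder(); // a new outermost table begins
        }
        if (current != null) {
            current.append('<').append(n).append('>'); // attributes omitted in this sketch
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current == null) {
            return; // text outside any table is ignored here
        }
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c == '<') {
                current.append("&lt;");
            } else if (c == '>') {
                current.append("&gt;");
            } else if (c == '&') {
                current.append("&amp;");
            } else {
                current.append(c);
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;
        }
        String n = name(localName, qName);
        current.append("</").append(n).append('>');
        if ("table".equals(n) && --depth == 0) {
            // Outermost table closed: convert the segment to bytes and emit it.
            byte[] bytes = current.toString().getBytes(StandardCharsets.UTF_8);
            current = null;
            tableSink.accept(new ByteArrayInputStream(bytes));
        }
    }
}
```

Each table is buffered fully in memory before it is emitted, which matches the "if small" caveat above; a large table would need a streaming approach instead.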

So now I think the big doc approach is not bad. What do you think?


was (Author: lfcnassif):
If the inputStream (pseudoInputStream) received by EmbeddedDocExtractor can not 
be read, I think using EDE is not useful. How will this approach work with 
TikaCli --extract option? My original idea was to support an use case like 
TikaCli --extract...

Now I think this extraction of tables to files can be done handling the db as 
one big doc and using a ContentHandlerDecorator that will split the xhtml 
output at table boundaries. Each xhtml segment can be converted to a byte[] (if 
small) and then to a ByteArrayInputStream that can be handled by an 
EmbeddedDocDecorator, if setted into parseContext. If not setted the 
ContentHandlerDecorator do not need to split tables and can fallback to default 
behavior. A custom EDE can then extract tables to files if desired.

So now I think we could go with the big doc approach. What do you think?

 Create a parser for SQLite3
 ---

 Key: TIKA-1511
 URL: https://issues.apache.org/jira/browse/TIKA-1511
 Project: Tika
  Issue Type: New Feature
  Components: parser
Affects Versions: 1.6
Reporter: Luis Filipe Nassif
 Fix For: 1.8

 Attachments: TIKA-1511v1.patch, TIKA-1511v2.patch, testSQLLite3b.db


 I think it would be very useful, as sqlite is used as data storage by a wide 
 range of applications. Opening the ticket to track it. 





[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-18 Thread Luis Filipe Nassif (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281086#comment-14281086
 ] 

Luis Filipe Nassif edited comment on TIKA-1511 at 1/18/15 2:09 PM:
---

If the inputStream (pseudoInputStream) received by EmbeddedDocExtractor cannot 
be read, I think using EDE is not useful. How will this approach work with 
the TikaCli --extract option? My original idea was to support a use case like 
TikaCli --extract...

Now I think this extraction of tables to files can be done handling the db as 
one big doc and using a ContentHandlerDecorator that will split the xhtml 
output at table boundaries. Each xhtml segment can be converted to a byte[] (if 
small) and then to a ByteArrayInputStream that can be handled by an 
EmbeddedDocDecorator, if setted into parseContext. If not setted the 
ContentHandlerDecorator do not need to split tables and can fallback to default 
behavior. A custom EDE can then extract tables to files if desired.

So now I think we could go with the big doc approach. What do you think?


was (Author: lfcnassif):
If the inputStream (pseudoInputStream) received by EmbeddedDocExtractor can not 
be read, I think using EDE is not useful. How will this approach work with 
TikaCli --extract option? My original idea was to support an use case like 
TikaCli --extract...

Now I think this extraction of tables to files can be done handling the db as 
one big doc and using a ContentHandlerDecorator that will split the xhtml 
output at table bondaries. Each xhtml segment can be converted to a byte[] (if 
small) and then to a ByteArrayInputStream that can be passed to a 
EmbeddedDocDecorator, if set on parseContext. If not set the 
ContentHandlerDecorator do not need to split tables and can fallBack to default 
behavior. A custom EDE can then extract tables to files if desired.

So now I think we could go with the big doc approach. What do you think?



[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-16 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280345#comment-14280345
 ] 

Tim Allison edited comment on TIKA-1511 at 1/16/15 3:08 PM:


First draft of patch attached.  Need to build out tests, obviously, and I'll 
fix spelling of SQLLite in the class names! :)

For the design, I created a public parser that called a new *DBParser class for 
each call to parse (like many other parsers) to avoid thread safety issues. 

The *DBParser, in turn, calls the EmbeddedDocumentExtractor for each table, and 
it specifies, via a special mime-type, which *TableParser will be called. 

The *TableParser ignores the empty InputStream, and grabs the 
StatementTablePair from the ParseContext to parse each table.
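
[Editor's sketch] The two ideas above (a fresh worker per parse call, and a table parser that ignores its empty stream and takes its real input from the context) can be illustrated without Tika on the classpath. Everything here is a hypothetical stand-in: a plain Map plays the role of ParseContext, TableHandle plays the role of StatementTablePair, and the EmbeddedDocumentExtractor and mime-type dispatch are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the table object passed through the context. */
final class TableHandle {
    final String tableName;

    TableHandle(String tableName) {
        this.tableName = tableName;
    }
}

/** Per-table parser: ignores the (empty) stream and reads its input from the context. */
final class SketchTableParser {
    String parse(InputStream ignored, Map<Class<?>, Object> context) {
        TableHandle handle = (TableHandle) context.get(TableHandle.class);
        if (handle == null) {
            throw new IllegalStateException("no TableHandle in context");
        }
        return "parsed table " + handle.tableName;
    }
}

/** Stateful worker: holds mutable per-call state, so it is created once per parse call. */
final class SketchDbParser {
    private final StringBuilder out = new StringBuilder();

    String parse(Iterable<String> tableNames) {
        SketchTableParser tableParser = new SketchTableParser();
        for (String name : tableNames) {
            Map<Class<?>, Object> context = new HashMap<>();
            context.put(TableHandle.class, new TableHandle(name));
            // The stream is a zero-byte placeholder; the real input is in the context.
            InputStream empty = new ByteArrayInputStream(new byte[0]);
            out.append(tableParser.parse(empty, context)).append('\n');
        }
        return out.toString();
    }
}

/** Stateless entry point: safe to share across threads because it keeps no fields. */
final class SketchPublicParser {
    String parse(Iterable<String> tableNames) {
        return new SketchDbParser().parse(tableNames);
    }
}
```

Because SketchPublicParser has no fields, concurrent callers never share the StringBuilder held by SketchDbParser.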

Also, as part of the design, the EmbeddedDocumentExtractor is called for each 
BLOB and each CLOB.

The jdbc wrapper around sqlite is not able to read CLOBs (apparently?), 
although I could write them without exception (doesn't mean they were actually 
written), and it does some other stuff that is not standard JDBC, but that is 
all handled in SQLiteTableParser, a subclass of AbstractTableParser.

Any and all feedback is welcomed.  This is still drafty.



was (Author: talli...@mitre.org):
First draft of patch attached.  Need to build out tests, obviously, and I'll 
fix spelling of SQLLite in the class names! :)

For the design, I had to create a public parser that called a new *DBParser 
class for each call to parse (like many other parsers) to avoid thread safety 
issues. 

The *DBParser, in turn, calls the EmbeddedDocumentParser for each table, and it 
specifies via special mime-type, which *TableParser will be called. 

The *TableParser ignores the InputStream, and grabs the StatementTablePair from 
the ParseContext to parse each table.

The jdbc wrapper around sqlite is not able to read CLOBs (apparently?), 
although I could write them without exception (doesn't mean they were actually 
written), and it does some other stuff that is not standard JDBC, but that is 
all handled in SQLiteTableParser, a subclass of AbstractTableParser.

Any and all feedback is welcomed.  This is still drafty.


 Create a parser for SQLite3
 ---

 Key: TIKA-1511
 URL: https://issues.apache.org/jira/browse/TIKA-1511
 Project: Tika
  Issue Type: New Feature
  Components: parser
Affects Versions: 1.6
Reporter: Luis Filipe Nassif
 Fix For: 1.8

 Attachments: TIKA-1511v1.patch, testSQLLite3b.db


 I think it would be very useful, as sqlite is used as data storage by a wide 
 range of applications. Opening the ticket to track it. 





[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275534#comment-14275534
 ] 

Tim Allison edited comment on TIKA-1511 at 1/13/15 5:03 PM:


Looks like we're cleared via LEGAL-215 for xerial or anything else that wraps 
sqlite.  We can add some language about the underlying sqlite non-license and 
we should be good to go.

I think my preference for now would be to have an abstract base class (with at 
least these abstract methods: getConnection(), getTableNames(), 
addMetadata(Connection connection, Metadata metadata), close(Connection 
connection) that we can extend for each db parser. The abstract class would 
implement the select * from eachtable.  This plan would only work for 
jdbc-compliant dependencies that can return a Connection. It appears that this 
plan would work for xerial but not for sqlite4java...that said, I'm not above 
writing a separate parser for db-specific calls as with sqlite4java's 
SQLiteConnection. :)
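
[Editor's sketch] A rough shape for the abstract base class described above, using only java.sql from the JDK. The class name AbstractJdbcDumper, the helper selectAll, and passing the Connection into getTableNames are assumptions made for illustration; addMetadata is omitted because Metadata is a Tika type, and rows are returned as lists of strings instead of being written to a content handler.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical base class: subclasses supply the db-specific pieces. */
abstract class AbstractJdbcDumper {

    protected abstract Connection getConnection() throws SQLException;

    protected abstract List<String> getTableNames(Connection connection) throws SQLException;

    protected abstract void close(Connection connection) throws SQLException;

    /** Builds "select * from <table>", quoting the identifier in standard SQL style. */
    static String selectAll(String tableName) {
        return "SELECT * FROM \"" + tableName.replace("\"", "\"\"") + "\"";
    }

    /** Shared logic: read every row of one table as strings. */
    List<List<String>> dumpTable(Connection connection, String tableName) throws SQLException {
        List<List<String>> rows = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery(selectAll(tableName))) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                List<String> row = new ArrayList<>(columns);
                for (int i = 1; i <= columns; i++) { // JDBC columns are 1-based
                    row.add(rs.getString(i));
                }
                rows.add(row);
            }
        }
        return rows;
    }

    /** Template method: connect, dump each table in order, always close. */
    Map<String, List<List<String>>> dumpAll() throws SQLException {
        Connection connection = getConnection();
        try {
            Map<String, List<List<String>>> result = new LinkedHashMap<>();
            for (String table : getTableNames(connection)) {
                result.put(table, dumpTable(connection, table));
            }
            return result;
        } finally {
            close(connection);
        }
    }
}
```

A SQLite subclass would only need to implement the three abstract methods; as noted above, this shape fits drivers that can return a java.sql.Connection.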

I defer to the community on whether to go this route, the ManifoldCF route or 
another.

As [~grossws] recommended, we can build the parsers and then do a check for 
whether or not the drivers are available.  The user would be responsible for 
adding any non Apache licensable jars to their classpath.
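
[Editor's sketch] The driver availability check mentioned above could be as small as the following. The helper name JdbcDriverCheck is hypothetical and not part of Tika; it only probes whether a class with the given name can be loaded.

```java
/** Hypothetical helper: reports whether a driver class is on the classpath. */
final class JdbcDriverCheck {
    private JdbcDriverCheck() {
    }

    static boolean isAvailable(String driverClassName) {
        try {
            // initialize=false: only probe for presence, do not run static initializers.
            Class.forName(driverClassName, false, JdbcDriverCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

A parser could call, for example, JdbcDriverCheck.isAvailable("org.sqlite.JDBC") (the driver class name used by xerial's sqlite-jdbc) and skip the format when the jar is absent.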


was (Author: talli...@mitre.org):
Looks like we're cleared via LEGAL-215 for xerial.  We can add some language 
about the underlying sqlite non-license and we should be good to go.

I think my preference for now would be to have an abstract base class (with at 
least these abstract methods: getConnection(), getTableNames(), 
addMetadata(Connection connection, Metadata metadata), close(Connection 
connection) that we can extend for each db parser.  But I defer to the 
community.

As [~grossws] recommended, we can build the parsers and then do a check for 
whether or not the drivers are available.  The user would be responsible for 
adding any non Apache licensable jars to their classpath.

 Create a parser for SQLite3
 ---

 Key: TIKA-1511
 URL: https://issues.apache.org/jira/browse/TIKA-1511
 Project: Tika
  Issue Type: New Feature
  Components: parser
Affects Versions: 1.6
Reporter: Luis Filipe Nassif
 Fix For: 1.8


 I think it would be very useful, as sqlite is used as data storage by a wide 
 range of applications. Opening the ticket to track it. 





[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275534#comment-14275534
 ] 

Tim Allison edited comment on TIKA-1511 at 1/13/15 5:04 PM:


Looks like we're cleared via LEGAL-215 for xerial or anything else that wraps 
sqlite.  We can add some language about the underlying sqlite non-license and 
we should be good to go.

I think my preference for now would be to have an abstract base class (with at 
least these abstract methods: getConnection(), getTableNames(), 
addMetadata(Connection connection, Metadata metadata), close(Connection 
connection)) that we can extend for each db parser. The abstract class would 
implement the select * from eachtable.  This plan would only work for 
jdbc-compliant-ish dependencies that can return a Connection. It appears that 
this plan would work for xerial but not for sqlite4java...that said, I'm not 
above writing a separate parser for db-specific calls as with sqlite4java's 
SQLiteConnection. :)

I defer to the community on whether to go this route, the ManifoldCF route or 
another.

As [~grossws] recommended, we can build the parsers and then do a check for 
whether or not the drivers are available.  The user would be responsible for 
adding any non Apache licensable jars to their classpath.


was (Author: talli...@mitre.org):
Looks like we're cleared via LEGAL-215 for xerial or anything else that wraps 
sqlite.  We can add some language about the underlying sqlite non-license and 
we should be good to go.

I think my preference for now would be to have an abstract base class (with at 
least these abstract methods: getConnection(), getTableNames(), 
addMetadata(Connection connection, Metadata metadata), close(Connection 
connection) that we can extend for each db parser. The abstract class would 
implement the select * from eachtable.  This plan would only work for 
jdbc-compliant dependencies that can return a Connection. It appears that this 
plan would work for xerial but not for sqlite4java...that said, I'm not above 
writing a separate parser for db-specific calls as with sqlite4java's 
SQLiteConnection. :)

I defer to the community on whether to go this route, the ManifoldCF route or 
another.

As [~grossws] recommended, we can build the parsers and then do a check for 
whether or not the drivers are available.  The user would be responsible for 
adding any non Apache licensable jars to their classpath.



[jira] [Comment Edited] (TIKA-1511) Create a parser for SQLite3

2015-01-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275217#comment-14275217
 ] 

Tim Allison edited comment on TIKA-1511 at 1/13/15 1:59 PM:


Thank you, [~grossws]!  

Two questions:

1) On how to exclude the native libs...is it ok to require that people 
re-bundle, that is just get rid of the dependency in the pom and build from 
scratch? Is there a cleaner method?

2) Would it be better to require users who want SQLLite3 parsing to add xerial 
to their classpath?  We'll probably need to do this for formats that don't 
have Apache friendly drivers (afaik: .mdb, .dbf, others?)


was (Author: talli...@mitre.org):
Thank you, [~grossws]!  

Two questions:

1) On how to exclude the native libs...is it ok to require that people 
re-bundle, that is just get rid of the dependency in the pom and build from 
scratch? Is there a cleaner method?

2) Would it be better to require users who want SQLLite3 parsing to add xerial 
to their classpath?  
