[ 
https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239062#comment-15239062
 ] 

Tim Allison edited comment on TIKA-1513 at 4/13/16 10:52 AM:
-------------------------------------------------------------

[~gagravarr], would you mind taking a look at the detector?  Is there a way 
that we can convert this to a mime definition?  Or should we add a DBFDetector?

[~nicholasc], it looks great to me.  I agree that we'll probably want to relax 
some of the length checks (just make sure they're > 0 or something 
reasonable).  We wouldn't want this to fail on truncated DBF files, and as 
you've pointed out, there can be extra bytes at the end of the file.  If 
there's any way to avoid adding the dependency, that'd be great, although I 
very much appreciate the concern for overflow!
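
A minimal sketch of what relaxed checks might look like, assuming the standard 
dBASE header layout (last-update date at bytes 1-3, record count as an unsigned 
32-bit little-endian value at bytes 4-7, header length and record length as 
unsigned 16-bit little-endian values at bytes 8-9 and 10-11).  The class and 
method names are hypothetical and are not taken from the attached patch, and 
the version byte is deliberately not checked here.  The record count is widened 
to a long with plain JDK bit operations, which is one way to sidestep overflow 
without an extra dependency.

```java
// Hypothetical helper, not Tika's actual detector: a lenient sanity check of
// the fixed 32-byte DBF header prefix that uses only the JDK.
final class LenientDbfHeaderCheck {

    // Size of the fixed part of the header that precedes the field descriptors.
    private static final int HEADER_PREFIX_LENGTH = 32;

    private LenientDbfHeaderCheck() {
    }

    /** Reads an unsigned 16-bit little-endian value as an int. */
    static int readUInt16LE(byte[] b, int offset) {
        return (b[offset] & 0xFF) | ((b[offset + 1] & 0xFF) << 8);
    }

    /** Reads an unsigned 32-bit little-endian value as a long, so it cannot go negative. */
    static long readUInt32LE(byte[] b, int offset) {
        return (b[offset] & 0xFFL)
                | ((b[offset + 1] & 0xFFL) << 8)
                | ((b[offset + 2] & 0xFFL) << 16)
                | ((b[offset + 3] & 0xFFL) << 24);
    }

    /**
     * Header length plus record count times record length, computed in long
     * arithmetic. The maximum possible value fits comfortably in a long.
     */
    static long declaredLength(byte[] prefix) {
        long numRecords = readUInt32LE(prefix, 4);
        int headerLength = readUInt16LE(prefix, 8);
        int recordLength = readUInt16LE(prefix, 10);
        return headerLength + numRecords * recordLength;
    }

    /**
     * Returns true if the first 32 bytes are plausible for a DBF file.
     * The declared length is intentionally NOT compared with the real file
     * length, so truncated files and files with trailing bytes still pass.
     */
    static boolean looksLikeDbf(byte[] prefix) {
        if (prefix == null || prefix.length < HEADER_PREFIX_LENGTH) {
            return false;
        }
        int month = prefix[2] & 0xFF;
        int day = prefix[3] & 0xFF;
        if (month < 1 || month > 12 || day < 1 || day > 31) {
            return false;
        }
        int headerLength = readUInt16LE(prefix, 8);
        int recordLength = readUInt16LE(prefix, 10);
        // Relaxed checks: only require "something reasonable".
        return headerLength > HEADER_PREFIX_LENGTH && recordLength > 0;
    }
}
```

Because the declared length is never compared against the actual stream 
length, truncated files and files with trailing bytes would both pass this 
check.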

In your experience, do we need to validate the field entries, or can we stop 
sooner?  If we do, I suspect there's no way to convert this to a mime 
definition, though much of the earlier checking could easily be translated.

Oh, and please make sure to add an Apache license header...unless Nick B can 
easily translate this to a mime definition. :)

Thank you!


> Add mime detection and parsing for dbf files
> --------------------------------------------
>
>                 Key: TIKA-1513
>                 URL: https://issues.apache.org/jira/browse/TIKA-1513
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 1.13
>
>
> I just came across an Apache licensed dbf parser that is available on 
> [maven|https://repo1.maven.org/maven2/org/jamel/dbf/dbf-reader/0.1.0/dbf-reader-0.1.0.pom].
> Let's add dbf parsing to Tika.
> Any other recommendations for alternate parsers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
