Hello, everyone. I am a Tika committer but have not been active for a long 
time. I've been looking over the code and would appreciate it if you could answer 
some questions:

1) There is a Jira issue (at 
https://issues.apache.org/jira/browse/DRILL-6256?jql=text%20~%20%22readme%20java%207%22)
regarding the mention of Java 1.7 in the README 
(https://github.com/apache/tika/blob/master/README.md). It was marked as fixed, 
but I still see Java 7 mentioned. Tika should work with the most recent 
versions of Java, right? Shouldn't we update the README accordingly? I noticed 
that there is a "tika-java7" directory in the project consisting solely of a 
TikaFileTypeDetector class. Can you help me understand what the connection with 
Java version 7 is? Is it that Tika code should not use features that were 
absent in Java 7 (such as lambdas)?

2) I would like to bring "Rika" (https://github.com/ricn/rika), a Ruby wrapper 
around Tika, up to date with respect to the dependency jar files packaged with 
it. My plan is to check out the commit tagged 1.22, do a fresh Maven install, 
and use the jar files that were installed ("~/.m2/repository/**/*jar"). 
However, Rika unconditionally loads all the jar files; would it be faster to 
just use the single jar file of the Tika distribution (e.g. tika-app-1.22.jar) 
so that only one file, rather than n, needs to be loaded?

3) The description for the GitHub repo at https://github.com/apache/tika says 
"Tika Mirror". Is it really a mirror, or has it become the authoritative 
source? (Given that I saw mentions of pull requests, I suspect the latter.) If 
the latter, I suggest changing that text to something like "Tika Authoritative 
Repository", as it is currently misleading.

Thanks,
Keith

--      
Keith R. Bennett
about.me/keithrbennett
