Hey Keith… Your question #3 made me curious, as I thought GitHub was a mirror, but according to https://devclass.com/2019/04/30/apache-heads-to-github/ it looks like GitHub is the authoritative repo. https://tika.apache.org/contribute.html says the same thing…

So yes, I think the title does need updating. Apache Spark’s GitHub description is “Apache Spark”, so ours could be “Apache Tika”.

I'm not sure I can answer 1.

As for 2, I find I typically use the tika-app jar unless I am carefully choosing which dependencies I want.

> On Sep 9, 2019, at 8:21 AM, Keith Bennett <[email protected]> wrote:
>
> Hello, everyone. I am a Tika committer but have not been active for a long
> time. I've been looking over the code and would appreciate if you could
> answer some questions:
>
> 1) There is a Jira issue (at
> https://issues.apache.org/jira/browse/DRILL-6256?jql=text%20~%20%22readme%20java%207%22)
> regarding the mention of Java 1.7 in the README
> (https://github.com/apache/tika/blob/master/README.md). It was marked as
> fixed, but I still see Java 7 mentioned. Tika should work with the most
> recent versions of Java, right? Should we not update the readme accordingly?
> I noticed that there is a "tika-java7" directory in the project consisting
> solely of a TikaFileTypeDetector class. Can you help me understand what the
> connection with Java version 7 is? Is it that Tika code should not use
> features that were absent in Java 7 (such as lambdas)?
>
> 2) I would like to bring "Rika" (https://github.com/ricn/rika), a Ruby
> wrapper around Tika, up to date with respect to the dependency jar files
> packaged with it. I thought I would check out the commit to which the 1.22
> tag was attached, and do a fresh maven install, and use the files that were
> installed ("~/.m2/repository/**/*jar"). Then again, Rika unconditionally
> loads all the jar files; would it be faster to just use the jar file of the
> Tika distribution (e.g. tika-app-1.22.jar) so that only one instead of n
> files needs to be loaded?
>
> 3) The description for the Github repo at https://github.com/apache/tika says
> "Tika Mirror". Is it really a mirror, or has it become the authoritative
> source?
> (Given that I saw mentions of pull requests, I suspect the latter.)
> If the latter, I suggest changing that text to something like "Tika
> Authoritative Repository", as it is currently misleading.
>
> Thanks,
> Keith
>
> --
> Keith R. Bennett
> about.me/keithrbennett

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
