Hey Keith…

Your question #3 made me curious, as I thought GitHub was a mirror, but 
according to https://devclass.com/2019/04/30/apache-heads-to-github/ it looks 
like GitHub is the authoritative repo. The contribution page at 
https://tika.apache.org/contribute.html says the same thing…

So yes, I think the title does need updating. Apache Spark’s GitHub 
description is “Apache Spark”, so ours could be “Apache Tika”.

Not sure I can answer 1.   

As for 2, I find I typically use the tika-app jar unless I am carefully 
choosing which dependencies I want.

> On Sep 9, 2019, at 8:21 AM, Keith Bennett <[email protected]> wrote:
> 
> Hello, everyone. I am a Tika committer but have not been active for a long 
> time. I've been looking over the code and would appreciate if you could 
> answer some questions:
> 
> 1) There is a Jira issue (at 
> https://issues.apache.org/jira/browse/DRILL-6256?jql=text%20~%20%22readme%20java%207%22)
>  regarding the mention of Java 1.7 in the README 
> (https://github.com/apache/tika/blob/master/README.md). It was marked as 
> fixed, but I still see Java 7 mentioned. Tika should work with the most 
> recent versions of Java, right? Should we not update the readme accordingly? 
> I noticed that there is a "tika-java7" directory in the project consisting 
> solely of a TikaFileTypeDetector class. Can you help me understand what the 
> connection with Java version 7 is? Is it that Tika code should not use 
> features that were absent in Java 7 (such as lambdas)?
> 
> 2) I would like to bring "Rika" (https://github.com/ricn/rika), a Ruby 
> wrapper around Tika, up to date with respect to the dependency jar files 
> packaged with it. I thought I would check out the commit to which the 1.22 
> tag was attached, and do a fresh maven install, and use the files that were 
> installed ("~/.m2/repository/**/*jar"). Then again, Rika unconditionally 
> loads all the jar files; would it be faster to just use the jar file of the 
> Tika distribution (e.g. tika-app-1.22.jar) so that only one instead of n 
> files needs to be loaded? 
> 
> 3) The description for the Github repo at https://github.com/apache/tika says 
> "Tika Mirror". Is it really a mirror, or has it become the authoritative 
> source? (Given that I saw mentions of pull requests, I suspect the latter.) 
> If the latter, I suggest changing that text to something like "Tika 
> Authoritative Repository", as it is currently misleading.
> 
> Thanks,
> Keith
> 
> --    
> Keith R. Bennett
> about.me/keithrbennett
> 

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
