> From: Allison, Timothy B.
> Sent: April 20, 2015 5:11:04am PDT
> To: dev@tika.apache.org
> Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
> 
> If I understand correctly, if we release rc2, Tika 1.8 will break in Hadoop 
> clusters across the land?!
> Or, Hadoop folks will have to apply a classloading workaround or rebuild 
> 1.8/trunk with small version mod in TIKA-1606 to get Tika to work.
> 
> For most Hadoopites, this will be a straightforward fix, and I'm assuming 
> that's why Ken is not more outspoken against releasing rc2 as is (Ken, let me 
> know if I'm wrong!).  

Usually it's straightforward, though whenever you start manipulating the 
class loader logic, you can get odd results.

For example, forcing your job jar's dependencies to show up first can cause 
one of your jars to mask an older/newer version that Hadoop needs, so the job 
fails for some other reason.
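
One way to see which copy of a class actually won is to ask the JVM where it loaded the class from. The following is only a rough sketch, not anything shipped with Tika, Hadoop, or Bixo: `ClasspathProbe`, `locationOf`, and `hasMethodReturning` are made-up names for illustration, and the Guava class names are taken from the stack trace quoted further down. It assumes the check runs with the same classpath ordering as the job itself.

```java
import java.security.CodeSource;
import java.util.Arrays;

// Hypothetical diagnostic helper; not part of Tika, Hadoop, or Bixo.
final class ClasspathProbe {

    // Reports the jar or directory a class was loaded from.
    static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually have no code source.
            return src == null ? "(no code source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    // True if the class loads and has a public method with this name
    // whose return type has the given fully qualified name.
    static boolean hasMethodReturning(String className, String methodName,
                                      String returnTypeName) {
        try {
            return Arrays.stream(Class.forName(className).getMethods())
                    .anyMatch(m -> m.getName().equals(methodName)
                            && m.getReturnType().getName().equals(returnTypeName));
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String builder = "com.google.common.cache.CacheBuilder";
        String loadingCache = "com.google.common.cache.LoadingCache";
        System.out.println(builder + " loaded from: " + locationOf(builder));
        // The Hadoop stack trace expects build(...) to return LoadingCache.
        System.out.println("build(...) returns LoadingCache: "
                + hasMethodReturning(builder, "build", loadingCache));
    }
}
```

If the second line prints false, the Guava that was picked up is missing the method Hadoop calls, which is the situation that ends in the NoSuchMethodError.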

But yes, I don't feel strongly enough about this to vote -1, as I don't think 
there are that many people using Tika with Hadoop.

For Bixo, I'd defer updating the Tika dependency until another version is 
released.

Don't know about Behemoth - Julien?

-- Ken


> For other users, though, say, in healthcare, where code security review is 
> stringent, this could be a real pain, no?
> 
> Am I understanding correctly what will happen?  If so, do we really want to 
> do this?
> 
> 
> -----Original Message-----
> From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] 
> Sent: Saturday, April 18, 2015 11:48 PM
> To: dev@tika.apache.org
> Subject: Re: [VOTE] Apache Tika 1.8 Release Candidate #2
> 
> +1 to pushing on Monday - if we have to roll a 1.9 quickly
> after, we can :)
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Tyler Palsulich <tpalsul...@gmail.com>
> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> Date: Saturday, April 18, 2015 at 11:29 PM
> To: "dev@tika.apache.org" <dev@tika.apache.org>
> Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2
> 
>> Hi Folks,
>> 
>> If there are no blocking complaints (OSGi?) by Monday (a little longer than
>> 3 days, I realize), I'll mark this as passed and finish the release
>> process.
>> 
>> Of course, it's no problem for me to cut another RC, if it's needed.
>> 
>> Have a great weekend!
>> Tyler
>> 
>> I've run into one problem while testing Tika 1.8 with Bixo.
>> 
>> It's a dependency issue involving (of course) Guava, since that
>> project loves to break their API :(
>> 
>> The bixo-core jar has these transitive dependencies on various versions of
>> Guava:
>> 
>> Hadoop - 11.0.2
>> Cascading - 14.0.1
>> Tika-parsers - 10.0.1
>>       cdm - 17.0
>> 
>> Everyone winds up using version 10.0.1 (note that Tika has a dependency on
>> cdm, which wants to use 17.0)
>> 
>> The problem is that Hadoop (for any recent version) uses an API from
>> Guava's cache implementation that no longer exists:
>> 
>> com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>> java.lang.NoSuchMethodError:
>> com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>>       at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
>>       at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
>>       at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1272)
>>       at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:79)
>> 
>> So what this means is that anyone trying to use Tika with Hadoop will need
>> to play games with the class loader to get the older version of Guava -
>> though that can cause other issues if Hadoop (or Cascading, etc) rely on
>> anything that's only in the newer Guava API.
>> 
>> Guava 10.0.1 was released about 3.5 years ago; 11.0.2 was from about 3
>> years ago. So it seems like we should upgrade to at least 11.0.2.
>> 
>> But I don't know if this is enough of an issue to require another RC.
>> 
>> -- Ken
>> 
>> PS - I've created https://issues.apache.org/jira/browse/TIKA-1606 to track
>> this.
>> 
>> 
>>> From: Tyler Palsulich
>>> Sent: April 13, 2015 10:56:29am PDT
>>> To: dev@tika.apache.org, u...@tika.apache.org
>>> Subject: [VOTE] Apache Tika 1.8 Release Candidate #2
>>> 
>>> Hi Folks,
>>> 
>>> A candidate for the Tika 1.8 release is available at:
>>>  https://dist.apache.org/repos/dist/dev/tika/
>>> 
>>> The release candidate is a zip archive of the sources in:
>>>  http://svn.apache.org/repos/asf/tika/tags/1.8-rc2/
>>> 
>>> The SHA1 checksum of the archive is
>>>  5e22fee9079370398472e59082d171ae2d7fdd31.
>>> 
>>> In addition, a staged maven repository is available here:
>>>  https://repository.apache.org/content/repositories/orgapachetika-1009
>>> 
>>> Please vote on releasing this package as Apache Tika 1.8. The vote is
>>> open for the next 72 hours and passes if a majority of at least three +1
>>> Tika PMC votes are cast.
>>> 
>>> [ ] +1 Release this package as Apache Tika 1.8
>>> [ ] ±0 I don't object to this release, but I haven't checked it
>>> [ ] -1 Do not release this package because...
>>> 
>>> Thanks,
>>> Tyler
>> 
>> 
>> --------------------------
>> Ken Krugler
>> +1 530-210-6378
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Cassandra & Solr
> 

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




