If I understand correctly, releasing rc2 as is means Tika 1.8 will break on Hadoop clusters across the land?! Or rather, Hadoop folks will have to apply a classloading workaround, or rebuild 1.8/trunk with the small version change from TIKA-1606, to get Tika to work.
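For anyone who wants to check their own setup: the method descriptor in Ken's stack trace below ends in LoadingCache, so the JVM is looking for a CacheBuilder.build(CacheLoader) with exactly that return type. Here is a minimal sketch that reports which jar CacheBuilder was loaded from and whether that exact signature is present. The class name GuavaCheck and both helper methods are made up for illustration; they are not part of Tika, Hadoop, or Guava.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Tika, Hadoop, or Guava.
public class GuavaCheck {

    /**
     * Returns true if className is loadable and has a public method with the
     * given name, return type, and parameter types (all given as class names).
     * Primitive parameter types are not supported by this sketch.
     */
    public static boolean hasMethod(String className, String methodName,
                                    String returnTypeName, String... paramTypeNames) {
        try {
            ClassLoader loader = GuavaCheck.class.getClassLoader();
            Class<?> cls = Class.forName(className, false, loader);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < params.length; i++) {
                params[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            Method m = cls.getMethod(methodName, params);
            // The JVM links on the full descriptor, so the return type must match too.
            return m.getReturnType().getName().equals(returnTypeName);
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, if it can be determined. */
    public static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className, false, GuavaCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String builder = "com.google.common.cache.CacheBuilder";
        System.out.println("CacheBuilder loaded from: " + locationOf(builder));
        System.out.println("build(CacheLoader) returning LoadingCache present: "
                + hasMethod(builder, "build",
                        "com.google.common.cache.LoadingCache",
                        "com.google.common.cache.CacheLoader"));
    }
}
```

Run it with the same classpath the job uses. If the second line prints false, the Guava that won dependency resolution does not have the signature Hadoop was compiled against.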
For most Hadoopites, this will be a straightforward fix, and I'm assuming that's why Ken is not more outspoken against releasing rc2 as is (Ken, let me know if I'm wrong!). For other users, though, say, in healthcare, where code security review is stringent, this could be a real pain, no?

Am I understanding correctly what will happen? If so, do we really want to do this?

-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Saturday, April 18, 2015 11:48 PM
To: dev@tika.apache.org
Subject: Re: [VOTE] Apache Tika 1.8 Release Candidate #2

+1 to pushing on Monday - if we have to roll a 1.9 quickly after, we can :)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Tyler Palsulich <tpalsul...@gmail.com>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Saturday, April 18, 2015 at 11:29 PM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: RE: [VOTE] Apache Tika 1.8 Release Candidate #2

>Hi Folks,
>
>If there are no blocking complaints (OSGi?) by Monday (a little longer than
>3 days, I realize), I'll mark this as passed and finish the release
>process.
>
>Of course, it's no problem for me to cut another RC, if it's needed.
>
>Have a great weekend!
>Tyler
>
>I've run into one problem while testing Tika 1.8 with Bixo.
>
>It involves a dependency issue involving (of course) Guava, since that
>project loves to break their API :(
>
>The bixo-core jar has these transitive dependencies on various versions of
>Guava:
>
>Hadoop - 11.0.2
>Cascading - 14.0.1
>Tika-parsers - 10.0.1
>  cdm - 17.0
>
>Everyone winds up using version 10.0.1 (note that Tika has a dependency on
>cdm, which wants to use 17.0)
>
>The problem is that Hadoop (for any recent version) uses an API from
>Guava's cache implementation that no longer exists:
>
>com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
>    at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
>    at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
>    at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1272)
>    at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:79)
>
>So what this means is that anyone trying to use Tika with Hadoop will need
>to play games with the class loader to get the older version of Guava -
>though that can cause other issues if Hadoop (or Cascading, etc) rely on
>anything that's only in the newer Guava API.
>
>Guava 10.0.1 was released about 3.5 years ago; 11.0.2 was from about 3
>years ago. So it seems like we should upgrade to at least 11.0.2.
>
>But I don't know if this is enough of an issue to require another RC.
>
>-- Ken
>
>PS - I've created https://issues.apache.org/jira/browse/TIKA-1606 to track
>this.
>
>
>> From: Tyler Palsulich
>> Sent: April 13, 2015 10:56:29am PDT
>> To: dev@tika.apache.org, u...@tika.apache.org
>> Subject: [VOTE] Apache Tika 1.8 Release Candidate #2
>>
>> Hi Folks,
>>
>> A candidate for the Tika 1.8 release is available at:
>> https://dist.apache.org/repos/dist/dev/tika/
>>
>> The release candidate is a zip archive of the sources in:
>> http://svn.apache.org/repos/asf/tika/tags/1.8-rc2/
>>
>> The SHA1 checksum of the archive is
>> 5e22fee9079370398472e59082d171ae2d7fdd31.
>>
>> In addition, a staged maven repository is available here:
>> https://repository.apache.org/content/repositories/orgapachetika-1009
>>
>> Please vote on releasing this package as Apache Tika 1.8. The vote is
>> open for the next 72 hours and passes if a majority of at least three +1
>> Tika PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Tika 1.8
>> [ ] ±0 I don't object to this release, but I haven't checked it
>> [ ] -1 Do not release this package because...
>>
>> Thanks,
>> Tyler
>
>
>--------------------------
>Ken Krugler
>+1 530-210-6378
>http://www.scaleunlimited.com
>custom big data solutions & training
>Hadoop, Cascading, Cassandra & Solr