Ok, turns out those same classes do exist in the new API, but weren't included in the 0.20.2 release for some reason - they're in a much more recent SVN commit:
http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/lib/join/

I've had to check out the latest mapreduce revision and build it manually; it looks like the "zipgroupfileset" element of Ant's jar task is my best bet for flattening and merging the new mapreduce jar with the overall core hadoop jar. Hopefully this won't cause any flagrant problems...

Shannon

On Mon, Aug 2, 2010 at 6:24 PM, Shannon Quinn <[email protected]> wrote:

> CompositeInputFormat implements a hadoop.mapred.join interface, whereas
> job.setInputFormatClass() is expecting a class that extends a hadoop.io
> class. Also, TupleWritable is in the deprecated hadoop.mapred package, too.
>
> Still hunting around the API for the newer equivalent; there has to be a
> way of doing this?
>
> On Mon, Aug 2, 2010 at 6:20 PM, Jake Mannix <[email protected]> wrote:
>
>> On Mon, Aug 2, 2010 at 3:13 PM, Shannon Quinn <[email protected]> wrote:
>> >
>> > Excellent. Any idea what the Hadoop 0.20.2 equivalent for
>> > CompositeInputFormat is? :)
>> >
>>
>> Ah, there is that part. Hmm... it's really really annoying to not have
>> that in 0.20.2.
>>
>> This is actually why I haven't migrated the distributed matrix stuff to
>> the newest Hadoop API - map-side join is pretty seriously useful sometimes.
>>
>> Does the old CompositeInputFormat work with the new API, does anyone know?
>>
>> -jake
>>
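
P.S. For anyone trying the same jar merge described at the top of this message, here is a rough sketch of an Ant target using zipgroupfileset. The project name, the directory build/jars, and the output name build/hadoop-merged.jar are placeholders I made up for illustration, not paths from the actual Hadoop build; it assumes the core hadoop jar and the freshly built mapreduce jar have both been copied into that directory.

```xml
<project name="merge-hadoop-jars" default="merge" basedir=".">
  <!-- Hypothetical locations: adjust to wherever the two jars actually live. -->
  <property name="jars.dir" value="build/jars"/>
  <property name="merged.jar" value="build/hadoop-merged.jar"/>

  <target name="merge">
    <!-- zipgroupfileset treats every matching archive as a zip file and
         copies its entries (not the archive itself) into the new jar,
         which flattens the inputs into one jar. -->
    <jar destfile="${merged.jar}">
      <zipgroupfileset dir="${jars.dir}" includes="*.jar"/>
    </jar>
  </target>
</project>
```

Running "ant merge" should produce the single flattened jar. One thing to watch: both input jars are likely to carry overlapping entries (META-INF files at least), and the jar task's "duplicate" attribute (add, preserve, or fail) controls what happens to them, so it may be worth setting that explicitly rather than relying on the default.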
