I'd like to get contrib jars added to the general hadoop CLASSPATH so that contrib classes are available to MR jobs. For example, a task that wants to read from an hbase table needs to be able to load the contrib class org.apache.hadoop.hbase.mapred.TableInputFormat. The alternative, requiring that every job jar include its contrib dependencies, doesn't seem user friendly.

HADOOP-1648 is about changing the bin/hadoop CLASSPATH construction to shoehorn in contrib jars, but it doesn't yet handle contrib lib directories. hbase, at least, has its own lib subdirectory with a couple of jars that it depends on that are not in HADOOP_HOME/lib. HADOOP-1648 should add a general copy of contrib lib directories so they too can be added to the hadoop CLASSPATH.

Where do folks think these extra dependencies should land when running the ant 'package' target? Into contrib/lib alongside all of the contrib jars, into contrib/CONTRIB_NAME/lib, or into the main hadoop lib dir in a subdir named contrib? (The last option would seem to make most sense to me -- its contents could be added after those of lib.)
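
To make the last option concrete, here is a rough sketch of how the CLASSPATH could be assembled if contrib jars and their dependencies landed in a contrib subdir of the main lib dir. This is only an illustration of the proposed layout, not what bin/hadoop does today; the function name build_classpath and the lib/contrib location are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: the lib/contrib layout is the proposal from this mail,
# not the current bin/hadoop behaviour. build_classpath is a hypothetical helper.

# Print a colon-separated classpath for the given hadoop home directory.
build_classpath() {
  local home="$1"
  local cp=""
  local jar
  # Core dependencies come first, then contrib jars and their dependencies.
  for jar in "$home"/lib/*.jar "$home"/lib/contrib/*.jar; do
    # An unmatched glob is left as a literal pattern; skip it.
    [ -f "$jar" ] || continue
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}
```

A caller could then do something like CLASSPATH=$(build_classpath "$HADOOP_HOME"). Because the contrib entries are appended after those of lib, a jar in lib would win over a same-named class shipped by a contrib.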

Thanks,
St.Ack

