Hi guys: I have an EMR job that seems to be loading an "old" version of an aws-sdk-java jar. Looking closer, I found that the Hadoop nodes I'm using in fact have an old version of the jar in $HOME/lib/, which is causing the problem.
This is most commonly seen, for example, with Jackson JSON jars. What is the simplest way to make the correct version of the jar take precedence?

So far I have tried two things: (1) setting "mapreduce.job.user.classpath.first" in my job configuration, which seems to have no effect, and (2) exporting HADOOP_USER_CLASSPATH_FIRST on the command line before launching my jobs. Neither seems to work (caveat: these are just initial attempts, so I may have gotten something minor wrong).

But before I bang my head against the shell scripts: can somebody suggest the right way to force a jar to be the "priority" jar loaded across all mappers and reducers, i.e. overriding the Hadoop classpath? A rough sketch of what I tried is below my signature.

-- Jay Vyas MMSB/UCHC
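P.S. For reference, here is roughly how attempt (1) is wired up in my driver, with attempt (2) noted in a comment. The class name, job name, and paths are just illustrative, not my actual job:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class ClasspathFirstDriver extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            Configuration conf = getConf();
            // Attempt (1): ask the framework to put user jars ahead of the
            // Hadoop-provided ones on the task classpath.
            conf.setBoolean("mapreduce.job.user.classpath.first", true);

            Job job = Job.getInstance(conf, "classpath-first-test");
            job.setJarByClass(ClasspathFirstDriver.class);
            // ... mapper/reducer/output setup elided ...
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // Attempt (2) was the equivalent of running
            //   export HADOOP_USER_CLASSPATH_FIRST=true
            // in the shell before invoking `hadoop jar ...`.
            System.exit(ToolRunner.run(new Configuration(), new ClasspathFirstDriver(), args));
        }
    }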