Hello,

To give you a bit of context, I wrote a Java library that aims to provide an easy way to coordinate multiple MR jobs and execute them with a single jar submission. The final result is a "fat jar" (built using the Maven assembly plugin) that contains the different Mapper and Reducer classes and a Main class that has the logic to submit the different jobs to the cluster.
To accomplish this, the Main relies on some text files (packaged in the jar) being present. Those files are not needed by the MR jobs themselves; they are a kind of configuration that tells the Main how to schedule the different MR jobs.

The jar is executed like this:

    hadoop jar the_jar_file.jar <args>

It has been used in production for a long time, but we recently decided to upgrade to Hadoop 2.6 (we were using 0.20). All our jobs packaged this way are now failing because the Main cannot locate the text files in the classpath.

I did a bit of debugging by replacing the Main with a piece of code that prints the content of the classpath. When running the jar with:

    java -jar the_jar_file.jar <args>

I can see the text files in the list. But when I run the same jar with:

    hadoop jar the_jar_file.jar <args>

the text files are missing.

I assume that something changed in the way the hadoop jar command reads the jar and builds the classpath. I found someone complaining about the same issue on Stack Overflow (http://stackoverflow.com/questions/31670390/accessing-jar-resource-when-run-in-hadoop) but nobody replied.

I would like to keep the same mechanism (keep those conf files in the jar and access them at runtime from the classpath). Maybe there is an option to alter the way the jar command behaves? Can someone point me to the source code of the jar command?

Thanks!
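
For anyone who wants to reproduce the check, here is a minimal sketch of the kind of diagnostic described above. It is not the exact code that was run: the class name ClasspathCheck and the default resource name jobs.conf are placeholders made up for illustration. It prints the entries of the java.class.path system property and then asks two class loaders directly whether they can see the resource. The second check is probably the more telling one, since java.class.path only reflects how the JVM was launched and not what a class loader created later can resolve.

```java
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic class; swap it in as the jar's Main to compare
// "java -jar" and "hadoop jar" runs.
class ClasspathCheck {

    // Splits the java.class.path system property into its individual entries.
    public static List<String> classpathEntries() {
        String cp = System.getProperty("java.class.path", "");
        List<String> entries = new ArrayList<>();
        for (String entry : cp.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Looks the resource up through the given loader.
    // Returns null when the loader cannot see it.
    public static URL locate(ClassLoader loader, String name) {
        if (loader == null) {
            // A null loader stands for the bootstrap loader; fall back to the system lookup.
            return ClassLoader.getSystemResource(name);
        }
        return loader.getResource(name);
    }

    public static void main(String[] args) {
        // "jobs.conf" is an assumed placeholder, not a file name from the original setup.
        String resource = args.length > 0 ? args[0] : "jobs.conf";

        for (String entry : classpathEntries()) {
            System.out.println("classpath entry: " + entry);
        }

        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = ClasspathCheck.class.getClassLoader();
        System.out.println("via context class loader: " + locate(context, resource));
        System.out.println("via own class loader:     " + locate(own, resource));
    }
}
```

Running it both ways with the same resource name as the first argument should show whether the file is invisible to the class loaders under hadoop jar, or merely absent from the printed classpath string.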
