On Nov 29, 2010, at 4:22 PM, Owen O'Malley wrote:

>> I do not support adding new dependencies to the classpath of MapReduce user
>> tasks.
>
> That isn't reasonable. As Hadoop evolves, we have and will continue to add
> dependences. For example, in your last MapReduce (MAPREDUCE-980) patch you
> added avro and paranamer as dependences.
>
As a non-PMC member: Hadoop has already put enough stuff on the classpath to force me to make a custom build to use it (0.19 was the start, and now no distribution can work without modification). This is because it keeps stuffing more and more things onto the classpath.

It is completely reasonable to ask that the environment that user code runs in not be polluted with libraries that are not exposed in the Hadoop API, and to debate the merits of a patch based on the inclusion of an additional jar on that classpath.

Webapp containers, OSGi, other classloader systems, or dependency rebasing (jarjar links, maven shade, etc.) help solve this sort of mess. Even more crudely, the user's lib directory doesn't have to be Hadoop's full lib directory, and the order of inclusion of jars can help.

Either way, if Hadoop wants to be an application execution framework, it can't just throw whatever it wants on the classpath forever. If one wants to provide lots of tools as part of a rich environment for users, the user has to be able to easily either _opt in_ to having those tools available on their classpath or _opt out_ of having them there.

Now, this is really a tangent to the other issues at hand. I'd like to suggest that rather than pointing fingers at who added what to which classpath and when, we simply note that classpath management is a problem that Hadoop needs to solve and not ignore. I'm pretty sure there's a JIRA on it somewhere already.

Until it is solved to some degree (on a scale of 1 to 10 for dealing with classpath collisions, Hadoop is currently somewhere between 0 and 1), it's going to limit what can be built without causing user applications to break on an upgrade. Whether those new features are good or bad on their own merits is being conflated with the classpath problems they introduce for users.

> -- Owen
