On Mar 8, 2011, at 1:21 PM, Ratner, Alan S (IS) wrote:
> We had tried putting all the libraries directly in HDFS with a pointer in
> mapred-site.xml:
> <property><name>mapred.child.env</name><value>LD_LIBRARY_PATH=/user/ngc/lib</value></property>
> as described in https://issues.apache.org/jira/browse/HADOOP-2838 but this
> did not work for us.
Correct. This isn't expected to work.
HDFS files are not directly accessible from the shell, or from any other
local process, without some extra step such as copying them to local disk.
For the above to work, anything reading the LD_LIBRARY_PATH environment
variable (in practice, the runtime linker) would have to know a) that
'/user/...' is inside HDFS and b) how to access it. The distributed cache
method works because it pulls files from HDFS and places them in the local
UNIX file system. From there, UNIX processes can access them.
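To make the point above concrete, here is a rough Python sketch of how an
LD_LIBRARY_PATH-style value gets resolved. It is only an approximation of what
the runtime linker does, not its actual algorithm, and the names find_library
and libdemo.so are made up for illustration. The thing to notice is that every
entry is treated as a plain local directory, so a path that exists only in
HDFS simply looks like a missing directory.

```python
import os
import tempfile


def find_library(name, ld_library_path):
    """Return the first local file matching `name`, or None.

    Simplified model: each colon-separated entry is looked up on the
    LOCAL file system only. Empty entries are skipped here, which is a
    simplification of what a real runtime linker does.
    """
    for directory in ld_library_path.split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as local_dir:
        # Stand-in for a library the distributed cache has copied to local disk.
        open(os.path.join(local_dir, "libdemo.so"), "w").close()

        # Stand-in for an HDFS-only path: nothing exists here locally.
        hdfs_only = os.path.join(local_dir, "hdfs-only", "lib")

        print(find_library("libdemo.so", hdfs_only))
        print(find_library("libdemo.so", hdfs_only + ":" + local_dir))
```

The first lookup finds nothing because the directory does not exist locally;
the second succeeds only because the file has been placed on local disk, which
is the job the distributed cache does for you.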
HADOOP-2838 is really about providing a way for applications to get to
libraries that are already installed at the UNIX level. (Although, in reality,
it would likely be better if applications were linked with, or the system
were configured with, a proper runtime library search path (-R, -rpath,
ld.so.conf, crle, etc.) rather than relying on LD_LIBRARY_PATH.)