Hello everyone, 

I am having trouble importing external non-Java libraries into my Hadoop code, 
which I am trying to run on a grid.
I followed the instructions here:
http://hadoop.apache.org/common/docs/current/native_libraries.html#Native+Shared+Libraries
but without success.

I have a JNI module, let's call it "myModule.so", and I have to use it in the 
Reduce step of my workflow. This module must be in the Java classpath.
 
A quick description of what I am doing:
 
1) Upload myModule.so to my home on HDFS

2) In the Driver code, I wrote the following:
 
public int run(String[] args) throws Exception
{
    Configuration conf = getConf();
    DistributedCache.createSymlink(conf);
        
    DistributedCache.addCacheFile(new URI("/myHDFShome/myModule.so#myModule.so"), conf);
    // Here I've also tried the addFileToClassPath() method and a URI of the form
    // "hdfs://address:port/myHDFShome/myModule.so#myModule.so"

    // I've tried with and without the following two instructions
    conf.set("mapred.child.java.opts", "-Djava.library.path=/myHDFShome/myModule.so");
    conf.set("mapred.child.env", "LD_LIBRARY_PATH=/myHDFShome/myModule.so:$LD_LIBRARY_PATH");
    // I've also tried:
    // conf.set("mapred.child.java.opts", "-Djava.library.path=./myModule.so");
    // conf.set("mapred.child.env", "LD_LIBRARY_PATH=./myModule.so:$LD_LIBRARY_PATH");

    Job job = new Job(conf, "JobName");
 
    ...    
}
 
3) In the Reducer code, I wrote
 
public void reduce(...)
{
    System.loadLibrary("myModule.so");
    ...
}
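
An alternative to System.loadLibrary in the snippet above is System.load, which takes an absolute file path and does not consult java.library.path. The sketch below assumes that the DistributedCache symlink myModule.so ends up in the task's current working directory. NativeLoader, resolve and loadOnce are hypothetical names, not part of Hadoop.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop: loads a native library through its
// absolute path, so java.library.path is not consulted at all.
class NativeLoader {
    private static boolean loaded = false;

    // Resolves a name relative to the current working directory. In a task,
    // this is assumed to be the directory holding the DistributedCache symlink.
    static File resolve(String linkName) {
        return new File(linkName).getAbsoluteFile();
    }

    // Loads the library at most once per JVM. Throws UnsatisfiedLinkError
    // carrying the full path when the file is missing.
    static synchronized void loadOnce(String linkName) {
        if (loaded) {
            return;
        }
        File lib = resolve(linkName);
        if (!lib.isFile()) {
            throw new UnsatisfiedLinkError("native library not found: " + lib);
        }
        System.load(lib.getAbsolutePath());
        loaded = true;
    }
}
```

Under that assumption, calling NativeLoader.loadOnce("myModule.so") once, for example from the Reducer's setup() method, would avoid repeating the load attempt on every reduce() call.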

The call System.loadLibrary("myModule.so") fails with: "Error: no myModule in 
java.library.path"
More specifically, the Reducers throw the following exception:
org.apache.hadoop.mapred.Child: Error running child : 
java.lang.UnsatisfiedLinkError : no myModule.so in java.library.path
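
One way to investigate the error above is to reproduce the lookup that System.loadLibrary performs. The sketch below is a minimal illustration, not Hadoop code: LibraryLookup, fileNameFor and find are hypothetical names. It relies on two JDK behaviours: System.mapLibraryName turns a bare name into a platform file name, and java.library.path is a list of directories, not files.

```java
import java.io.File;

// Hypothetical helper (not part of Hadoop or the JDK) that mimics the lookup
// done by System.loadLibrary, so a failing search can be inspected.
class LibraryLookup {

    // Platform file name for a bare library name, e.g. "libmyModule.so" on Linux.
    static String fileNameFor(String bareName) {
        return System.mapLibraryName(bareName);
    }

    // Returns the first match in a java.library.path-style list of
    // directories, or null when no directory contains the mapped file name.
    static File find(String bareName, String libraryPath) {
        String fileName = fileNameFor(bareName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries in this simplified sketch
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        System.out.println("myModule    -> " + fileNameFor("myModule"));
        System.out.println("myModule.so -> " + fileNameFor("myModule.so"));
        System.out.println("found: " + find("myModule", path));
    }
}
```

On Linux, the bare name myModule maps to libmyModule.so, while myModule.so maps to libmyModule.so.so, which may be why the lookup fails even when a file called myModule.so is present. A java.library.path entry such as /myHDFShome/myModule.so names a file (and an HDFS path, not a local one), so no directory search can match it.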

I also tried other approaches, such as setting the parameters with command-line 
options, as described in
http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#DistributedCache

In a nutshell, what is the correct way to import external native modules when 
running a job on distributed nodes?
Thank you!! 

Luca
