I have a flow running on 3 nodes (NiFi 1.3 on Red Hat 7) that forks into two 
branches, each terminating in a PutHDFS that writes to the same Hadoop 
cluster. I’m trying to use Snappy compression for both PutHDFS processors. 
The thing is, one of them works fine, and the other throws “native snappy 
library not available”. The behavior is the same across all nodes of the 
cluster. I have no idea how this is possible; either the JVM has access to 
the native libraries or it doesn’t... or so I thought :-/

Things I’ve looked at/tried:
Running hadoop checknative -a shows that the Snappy native library is indeed 
installed. If I run a hadoop command with debug logging on, I can see the 
native libs load correctly. For NiFi, I can see from this warning in the logs 
that they are not loading correctly:

2017-06-19 17:39:10,553 WARN [StandardProcessScheduler Thread-5] 
org.apache.hadoop.util.NativeCodeLoader Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
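For comparison, checknative on these same nodes reports everything as 
present. The output looks roughly like this (paths illustrative, from 
memory):

Native library checking:
hadoop:  true /usr/local/hadoop/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /usr/lib64/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so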

But I’ve checked, and NiFi is launched with the exact same 
-Djava.library.path as the hadoop command ($HADOOP_HOME/lib/native).
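For reference, NiFi reads extra JVM arguments from conf/bootstrap.conf, so 
the setting looks something like this (the arg number is arbitrary, and the 
path shown is illustrative for whatever $HADOOP_HOME/lib/native resolves to):

# conf/bootstrap.conf: passes the native lib dir to NiFi's JVM
java.arg.15=-Djava.library.path=/usr/local/hadoop/lib/native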

Other things I’ve tried (see the sketch after this list):
Setting JAVA_LIBRARY_PATH in bin/nifi-env.sh
Setting LD_LIBRARY_PATH in bin/nifi-env.sh
Copying the .so file into ./jre/lib of the java install
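Roughly what the nifi-env.sh additions looked like (paths illustrative; note 
that NiFi has to be restarted for these to take effect, and HADOOP_HOME must 
actually be set in NiFi's environment for the variable to expand):

# bin/nifi-env.sh: roughly what I added (illustrative paths)
export HADOOP_HOME=/usr/local/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH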

Here’s the error that the problematic PutHDFS outputs (it writes a 0-byte 
file when it does this):
2017-06-19 11:59:54,764 ERROR [Timer-Driven Process Thread-7] 
o.apache.nifi.processors.hadoop.PutHDFS 
PutHDFS[id=5adbf195-015c-1000-0000-00000bda8639] Failed to write to HDFS due to 
java.lang.RuntimeException: native snappy library not available: this version 
of libhadoop was built without snappy support.: {}
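That error comes from Hadoop's native code loader, so a quick standalone 
check along these lines (just a sketch, not part of the flow) would show what 
the JVM actually finds, if run with the same -Djava.library.path NiFi uses 
and hadoop-common on the classpath:

import org.apache.hadoop.util.NativeCodeLoader;

public class NativeCheck {
    public static void main(String[] args) {
        // True only if libhadoop.so was found via java.library.path
        boolean loaded = NativeCodeLoader.isNativeCodeLoaded();
        System.out.println("native hadoop loaded: " + loaded);
        // buildSupportsSnappy() is itself a native call, so only
        // attempt it once libhadoop has loaded
        if (loaded) {
            System.out.println("built with snappy: "
                    + NativeCodeLoader.buildSupportsSnappy());
        }
    }
}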

Any idea what I need to do here?

TIA,
Clark

