I have a flow running on 3 nodes (NiFi 1.3 on RHEL 7) that forks into two
branches, each terminating in a PutHDFS processor; both write to the same
Hadoop cluster. I’m trying to use Snappy compression on both PutHDFS
processors. The thing is, one of them works fine, and the other throws
“native snappy library not available”. The behavior is the same on every node
of the cluster. I don’t see how this is possible: either the JVM has access to
the native libraries or it doesn’t… or so I thought :-/
Things I’ve looked at/tried:
Running hadoop checknative -a shows that the Snappy native library is indeed
installed. If I run a hadoop command with debug logging on, I can also see the
native libs load correctly. For NiFi, though, this warning in the logs shows
that they are not loading:
2017-06-19 17:39:10,553 WARN [StandardProcessScheduler Thread-5]
org.apache.hadoop.util.NativeCodeLoader Unable to load native-hadoop library
for your platform... using builtin-java classes where applicable
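For comparison, here’s roughly what hadoop checknative -a reports on these
nodes (illustrative output; the exact library paths vary by install):

Native library checking:
hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /usr/lib64/libsnappy.so.1
lz4:     true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so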
But I’ve checked, and I’m passing NiFi the exact same -Djava.library.path that
the hadoop command uses ($HADOOP_HOME/lib/native).
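Concretely, that went into conf/bootstrap.conf as an extra JVM argument,
something like this (the arg number just needs to be unique among the
java.arg.* entries, and the path is spelled out literally since I don’t
believe bootstrap.conf expands environment variables; /opt/hadoop stands in
for wherever HADOOP_HOME points on these nodes):

java.arg.15=-Djava.library.path=/opt/hadoop/lib/native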
Other things I’ve tried:
Setting JAVA_LIBRARY_PATH in bin/nifi-env.sh (roughly as sketched below)
Setting LD_LIBRARY_PATH in bin/nifi-env.sh (likewise)
Copying the .so file into ./jre/lib of the java install
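For completeness, those nifi-env.sh additions looked roughly like this (paths
illustrative, same /opt/hadoop stand-in as above):

export JAVA_LIBRARY_PATH=/opt/hadoop/lib/native
export LD_LIBRARY_PATH=/opt/hadoop/lib/native:$LD_LIBRARY_PATH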
Here’s the error the problematic PutHDFS outputs (it also writes a 0-byte file
when this happens):
2017-06-19 11:59:54,764 ERROR [Timer-Driven Process Thread-7]
o.apache.nifi.processors.hadoop.PutHDFS
PutHDFS[id=5adbf195-015c-1000-0000-00000bda8639] Failed to write to HDFS due to
java.lang.RuntimeException: native snappy library not available: this version
of libhadoop was built without snappy support.: {}
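One more data point I can offer: hadoop checknative is just a wrapper around
org.apache.hadoop.util.NativeLibraryChecker, so the same check can be run
standalone against the exact library path NiFi gets, roughly like this
(a sketch; adjust the path to your install):

# run Hadoop’s own native-library check with the same -Djava.library.path NiFi uses
java -Djava.library.path=/opt/hadoop/lib/native \
  -cp "$(hadoop classpath)" \
  org.apache.hadoop.util.NativeLibraryChecker -a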
Any idea what I need to do here?
TIA,
Clark