[ https://issues.apache.org/jira/browse/NIFI-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427062#comment-15427062 ]
Joseph Witt commented on NIFI-2562:
-----------------------------------

So my initial thought is that if NiFi, using the Hadoop client, believes the data was properly written, then there might be very little we can do. Is there a newer Hadoop client you can use or have tried? You mentioned trying a custom writer. Can you compare its use of the client libraries to NiFi's?

> PutHDFS writes corrupted data in the transparent disk encryption zone
> ---------------------------------------------------------------------
>
>                 Key: NIFI-2562
>                 URL: https://issues.apache.org/jira/browse/NIFI-2562
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 0.6.0
>            Reporter: Vik
>            Priority: Blocker
>              Labels: encryption, security
>
> Problem 1: UnknownHostException
> When NiFi tries to ingest files into an HDFS encryption zone, it throws an
> UnknownHostException.
> Reason: In the Hadoop configuration files (core-site.xml and hdfs-site.xml),
> the KMS hosts were listed in the following format:
> "h...@xxxxxxx1.int.xxxx.com;xxxxxxx2.int.xxxx.com:16000".
> Because NiFi was using old Hadoop libraries (2.6.2), it could not resolve the
> two hosts; instead it treated them as a single host and threw
> UnknownHostException.
> We tried a couple of different fixes for this.
> Fix 1: Changing the configuration from a property like:
> <property>
>   <name>hadoop.security.key.provider.path</name>
>   <value>kms://h...@xxxxxxxx.int.xxxx.com;xxxxxxxx.int.xxxx.com:16000/kms</value>
> </property>
> to:
> <property>
>   <name>hadoop.security.key.provider.path</name>
>   <value>kms://h...@xxxxxxxx.int.xxxx.com:16000/kms</value>
> </property>
>
> Fix 2: Building the NiFi nar files with the Hadoop version installed on our
> system (2.6.0-cdh5.7.0).
> Steps followed:
> a) Changed the Hadoop version in the NiFi pom file from 2.6.2 to 2.6.0-cdh5.7.0.
> b) Ran mvn clean package -DskipTests
> c) Copied the following nar files to /opt/nifi-dev<number>/lib:
>    ./nifi-nar-bundles/nifi-hadoop-bundle/nifi-hadoop-nar/target/nifi-hadoop-nar-1.0.0-SNAPSHOT.nar
>    ./nifi-nar-bundles/nifi-hadoop-libraries-bundle/nifi-hadoop-libraries-nar/target/nifi-hadoop-libraries-nar-1.0.0-SNAPSHOT.nar
>    ./nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-nar/target/nifi-hbase-nar-1.0.0-SNAPSHOT.nar
>    ./nifi-nar-bundles/nifi-standard-services/nifi-http-context-map-bundle/nifi-http-context-map-nar/target/nifi-http-context-map-nar-1.0.0-SNAPSHOT.nar
> d) Restarted NiFi with bin/nifi.sh restart
> These fixes resolved the UnknownHostException for us, but we ran into Problem 2
> described below.
>
> Problem 2: Ingesting corrupted data into the HDFS encryption zone
> After resolving the UnknownHostException, NiFi was able to ingest files into
> the encryption zone, but the content of the files is corrupted.
> Approaches:
> Tried to reproduce the error with a sample Java program that uses similar
> logic and the same library, but it ingested files into the encryption zone
> without any problem.
> Checked the NiFi log files to find the cause; found that NiFi makes HTTP
> requests to the KMS to decrypt keys, but could not investigate further
> because no error is logged.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
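The behavior described in Problem 1 (two semicolon-separated KMS hosts treated as one unresolvable host) is consistent with how java.net.URI parses such an authority, which may explain why the single-host rewrite in Fix 1 helps. A minimal, self-contained sketch; the hostnames below are placeholders standing in for the masked hosts in the report, and "https" stands in for the masked scheme part:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class KmsUriDemo {
    public static void main(String[] args) throws URISyntaxException {
        // Multi-host form as it appeared in core-site.xml / hdfs-site.xml
        // (placeholder hostnames; the real hosts are masked in the report).
        URI multi = new URI("kms://https@host1.example.com;host2.example.com:16000/kms");

        // A semicolon is not legal inside a hostname, so server-based parsing
        // fails and the whole authority is kept as an opaque registry string:
        System.out.println(multi.getHost());      // null
        System.out.println(multi.getAuthority()); // https@host1.example.com;host2.example.com:16000
        // A client that then resolves that raw authority as if it were a
        // single hostname fails with UnknownHostException.

        // The single-host form from Fix 1 parses as a normal server authority:
        URI single = new URI("kms://https@host1.example.com:16000/kms");
        System.out.println(single.getHost());     // host1.example.com
        System.out.println(single.getPort());     // 16000
    }
}
```

This only illustrates the URI-parsing side of the problem; the multi-host syntax itself is handled by newer Hadoop clients, which split the authority into one KMS provider per host.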