[ https://issues.apache.org/jira/browse/NIFI-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687745#comment-15687745 ]
Bryan Bende commented on NIFI-3068:
-----------------------------------
Using the latest code from master (1.1-SNAPSHOT) I have been unable to
reproduce a scenario where a PutHDFS processor writes to the wrong cluster.
I did find, however, that the Hadoop client appears to share some
security-related state between processors. The scenario was the following:
- Start one PutHDFS processor writing to a kerberized HDFS
- Start a second PutHDFS processor writing to a non-secure HDFS; it writes
successfully
- The first PutHDFS processor now gets an error:
{code}
2016-11-21 22:05:43,610 ERROR [Timer-Driven Process Thread-2] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=01581004-7069-19ef-5ec2-87b728465117] Failed to write to HDFS due to org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]: org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
2016-11-21 22:05:43,612 ERROR [Timer-Driven Process Thread-2] o.apache.nifi.processors.hadoop.PutHDFS
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_74]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_74]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_74]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_74]
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.7.3.jar:na]
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) ~[hadoop-common-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2110) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.7.3.jar:na]
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317) ~[hadoop-hdfs-2.7.3.jar:na]
    at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:255) ~[nifi-hdfs-processors-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
{code}
Stopping the processor that reported the error and starting it again returns
it to a working state, and both processors then work at the same time again.
I tested adding the @RequiresInstanceClassLoading annotation to PutHDFS, which
guarantees that each instance of the processor has its own ClassLoader and so
cannot share any state with other instances. This resolves the problem.
I will attach a patch adding the annotation.
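To illustrate why a separate ClassLoader per instance removes the shared state,
here is a minimal plain-Java sketch. It uses no NiFi or Hadoop classes and is
not how the framework implements the annotation: FakeClient is a hypothetical
stand-in for a client library that keeps its security mode in a static field,
and IsolatingLoader is a hypothetical loader that defines its own copy of that
class. It assumes Java 9+ and that the compiled class file is readable as a
resource.

```java
import java.io.IOException;
import java.io.InputStream;

public class InstanceClassLoadingSketch {

    /** Hypothetical stand-in for a client that stores security settings in a static field. */
    public static class FakeClient {
        private static String authMode = "unset";
        public static void configure(String mode) { authMode = mode; }
        public static String authMode() { return authMode; }
    }

    /** Defines its own copy of FakeClient instead of delegating to the parent loader. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) { super(parent); }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(FakeClient.class.getName())) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes(); // Java 9+
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** One loader: the second "processor" overwrites the mode the first one set. */
    public static String sharedLoaderResult() {
        FakeClient.configure("KERBEROS"); // first processor
        FakeClient.configure("SIMPLE");   // second processor
        return FakeClient.authMode();     // what the first processor now sees
    }

    /** One loader per "processor": each copy of the class has its own static field. */
    public static String[] isolatedLoaderResult() {
        try {
            ClassLoader parent = InstanceClassLoadingSketch.class.getClassLoader();
            String name = FakeClient.class.getName();
            Class<?> first = new IsolatingLoader(parent).loadClass(name);
            Class<?> second = new IsolatingLoader(parent).loadClass(name);
            first.getMethod("configure", String.class).invoke(null, "KERBEROS");
            second.getMethod("configure", String.class).invoke(null, "SIMPLE");
            return new String[] {
                (String) first.getMethod("authMode").invoke(null),
                (String) second.getMethod("authMode").invoke(null)
            };
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared loader:    " + sharedLoaderResult());
        System.out.println("isolated loaders: " + String.join(", ", isolatedLoaderResult()));
    }
}
```

With a single loader the first "processor" ends up seeing the mode set by the
second one, which mirrors the SIMPLE-vs-KERBEROS error above. With one loader
per instance each copy keeps its own value, because a static field belongs to
a class as defined by a particular ClassLoader.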
> NiFi can not reliably support multiple HDFS clusters in the same flow
> ---------------------------------------------------------------------
>
> Key: NIFI-3068
> URL: https://issues.apache.org/jira/browse/NIFI-3068
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.0.0
> Reporter: Sam Hjelmfelt
> Assignee: Bryan Bende
> Labels: HDFS
>
> The HDFS configurations in PutHDFS are not respected when two (or more)
> PutHDFS processors exist with different configurations. The second processor
> to run will use the configurations from the first processor. This can cause
> data to be written to the wrong cluster.
> This appears to be caused by configuration caching in
> AbstractHadoopProcessor, which would affect all HDFS processors.
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/AbstractHadoopProcessor.java#L144