Currently, it means that the dataflow manager/developer is expected to set the "Execution Nodes" strategy to "Primary Node" at the time of flow design.
We don't have anything that restricts the scheduling strategy of a processor, but we should probably consider an annotation like @PrimaryNodeOnly that you can put on a processor, so that the framework enforces that it can only be scheduled on the primary node.

In the case of ListFile, I think the statement in the documentation is only partially true. When "Input Directory Location" is set to local, there should be no issue with scheduling the processor on all nodes in the cluster, as each node would be listing a local directory and storing state locally. When "Input Directory Location" is set to remote, it wouldn't make sense to have all nodes listing the same remote directory and getting the same results. In that case the state is also stored in ZooKeeper under a ZNode using the processor's UUID, and since the processor has the same UUID on each node, the nodes would be overwriting each other's state in ZK.

So ListFile probably can't be restricted to primary node only, whereas something like ListHDFS probably could, because it is always listing a remote destination.

On Fri, Feb 9, 2018 at 10:55 PM, Sivaprasanna <sivaprasanna...@gmail.com> wrote:

> I was going through ListFile processor's code and found out that in the
> documentation
> <https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListFile.java#L72-L76>,
> it is mentioned that "this processor is designed to run on Primary Node
> only in a cluster". I want to understand what "designed" stands for here.
> Does that mean the processor was built in a way that it only runs on the
> Primary node regardless of the "Execution Nodes" strategy being set
> otherwise, or does it mean that the dataflow manager/developer is expected
> to set the 'Execution Nodes' strategy to "Primary Node" at the time of
> flow design? If it is the former case, how is it handled in the code?
> If it is handled, it should be on the framework side, but I don't see any
> annotation indicating anything related to such a mechanism in the
> processor code. Moreover, a related JIRA, NIFI-543
> <https://issues.apache.org/jira/browse/NIFI-543>, is also open, so I want
> to clear my doubt.
>
> -
> Sivaprasanna
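
As a rough illustration of the @PrimaryNodeOnly idea raised in the reply, here is a minimal Java sketch of how a framework could detect such an annotation and override the requested scheduling strategy. Everything in it is hypothetical and not taken from NiFi's code base: the annotation definition, the ExecutionNode enum, the two stand-in processor classes, and the effectiveExecutionNode helper are assumptions made only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class PrimaryNodeOnlySketch {

    // Hypothetical marker annotation, as proposed in the thread.
    // RUNTIME retention is needed so the framework can see it via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface PrimaryNodeOnly {}

    // Hypothetical stand-in for the "Execution Nodes" setting.
    enum ExecutionNode { ALL_NODES, PRIMARY_NODE }

    // Stand-in for a processor that always lists a remote location.
    @PrimaryNodeOnly
    static class AlwaysRemoteListing {}

    // Stand-in for a processor that may list a local or a remote location,
    // so it cannot carry the annotation unconditionally.
    static class LocalOrRemoteListing {}

    // Assumed framework-side check: an annotated processor is forced onto the
    // primary node, otherwise the strategy chosen by the flow designer is kept.
    static ExecutionNode effectiveExecutionNode(Class<?> processorClass,
                                                ExecutionNode requested) {
        if (processorClass.isAnnotationPresent(PrimaryNodeOnly.class)) {
            return ExecutionNode.PRIMARY_NODE;
        }
        return requested;
    }

    public static void main(String[] args) {
        // Annotated processor: the requested ALL_NODES is overridden.
        System.out.println(effectiveExecutionNode(
                AlwaysRemoteListing.class, ExecutionNode.ALL_NODES));
        // Unannotated processor: the requested strategy is left untouched.
        System.out.println(effectiveExecutionNode(
                LocalOrRemoteListing.class, ExecutionNode.ALL_NODES));
    }
}
```

A class-level annotation like this fits a processor that always lists a remote destination. It does not fit a processor whose behavior depends on a property such as "Input Directory Location", which is why the reply suggests ListFile could not simply be annotated this way.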