The data is distributed evenly across all nodes. On two of the nodes it simply accumulates in the queue just before the write to HDFS, for days or weeks, or until the node runs out of memory. I first noticed this back in November and thought it was somewhat random, but this week I noticed that on every cluster the coordinator seems to be the only node that writes.
Pierre Villard wrote
> Hi Ben,
>
> There shouldn't be any issue. I'm wondering: is the input processor of
> your workflow running in "on primary node only" mode? If yes, then unless
> you specifically distribute the data across your nodes, the data will
> remain on the primary node from the beginning to the end of your workflow,
> and only one PutHDFS will actually write data. By any chance, would that
> be your case?
>
> -Pierre
>
> 2017-01-26 21:27 GMT+01:00 bmichaud <ben_michaud@>:
>
>> This happened on NiFi 1.0.0 as well as 1.1.1.
>>
>> I have a flow that uses PutHDFS to write to a remote HDFS. It connects to
>> this remote MapR system using JAAS Kerberos authentication configured in
>> the local MapR client. I build NiFi to incorporate the MapR libraries.
>>
>> Here is the flow writing data, with data accumulated:
>> <http://apache-nifi-developer-list.39713.n7.nabble.com/file/n14535/QueueToPutHDFS_Flow.png>
>>
>> Here is the status history (flow files out) for the queue immediately
>> preceding the PutHDFS processor:
>> <http://apache-nifi-developer-list.39713.n7.nabble.com/file/n14535/QueueToPutHDFS_FlowFilesOut_OneNode.png>
>>
>> Why does it only write from the coordinator? Any idea? I'm willing to fix
>> this in the code, and am planning on debugging into it to figure it out,
>> but I was hoping someone out there could point me to the right place.
>>
>> I have to work around this by using PutFile to the local FS and then
>> manually moving the files to the remote box. I was going for something a
>> little less manual. ;)
>>
>> Thanks!
>>
>> --
>> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/PutHDFS-does-not-write-on-all-nodes-of-a-cluster-only-the-on-the-coordinator-tp14535.html
>> Sent from the Apache NiFi Developer List mailing list archive at
>> Nabble.com.
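For anyone following the thread, the scenario Pierre describes can be illustrated with a toy model. This is only a sketch, not NiFi code: the function `simulate`, its parameters, and the node names are all made up for illustration, and the round-robin step stands in for whatever explicit redistribution a flow might use. It just counts how many flow files each node would hand to its local PutHDFS.

```python
from collections import Counter
from itertools import cycle


def simulate(nodes, primary, flowfiles, primary_only, redistribute):
    """Count how many flow files each node would pass to its local PutHDFS.

    Hypothetical toy model: nothing here calls or mirrors the NiFi API.
    """
    written = Counter({node: 0 for node in nodes})

    if primary_only:
        # Source processor scheduled on the primary node only:
        # every flow file starts its life on that node.
        origins = [primary] * flowfiles
    else:
        # Source processor runs on every node; model that as an even spread.
        spread = cycle(nodes)
        origins = [next(spread) for _ in range(flowfiles)]

    if redistribute:
        # Assumed explicit redistribution step: round-robin across the cluster.
        spread = cycle(nodes)
        targets = [next(spread) for _ in origins]
    else:
        # No redistribution: each flow file stays on the node where it started.
        targets = origins

    for node in targets:
        written[node] += 1
    return written


nodes = ["node1", "node2", "node3"]
stuck = simulate(nodes, "node1", 9, primary_only=True, redistribute=False)
balanced = simulate(nodes, "node1", 9, primary_only=True, redistribute=True)
print(dict(stuck))
print(dict(balanced))
```

In this model, a primary-only source with no redistribution leaves all the writing to one node. That is the case Pierre is asking about; it does not by itself explain data that is already spread across nodes but sits unwritten in the queue.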
--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/PutHDFS-does-not-write-on-all-nodes-of-a-cluster-only-the-on-the-coordinator-tp14535p14538.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
