That makes a lot of sense :) Thanks!

On Fri, Jan 29, 2021 at 3:33 PM Mark Payne <marka...@hotmail.com> wrote:
Zilvinas,

That's fantastic! I can't say for certain without delving in a whole lot deeper, but I would guess that the disk reads have dropped significantly because disk caching is now much better utilized. Before, you were bringing in a bunch of data and writing it to disk. That would evict "old" data from the cache, so when you then went to read the FlowFile content, it would have to go to disk. Now, you're not holding as much data in the content repository, because the backpressure is preventing you from bringing in huge amounts until you've dealt with what you have. As a result, you don't evict the content from the OS disk cache, so when you go to read, it's coming straight from the cache. That would mean way less disk reading and way better throughput. If I recall correctly, you're running on AWS with a 200 GB EBS volume for the content repo (though I could be remembering a different thread). That's not a particularly fast volume, so it makes sense that disk caching would give you dramatically better performance.

Thanks
-Mark

On Jan 29, 2021, at 10:16 AM, Zilvinas Saltys <zilvinas.sal...@verizonmedia.com> wrote:

Thanks Mark.

That explains it. It makes sense, as I know the thread count is also per node. I know from working with my team that it's counterintuitive to people: I set the queue to a max of 1000 items but then see 19,000 in a 25-node cluster. It can be confusing, although it's kind of obvious now that you've pointed it out. Perhaps expanding the tooltips to explain that the settings are per node could be helpful.

What I find very interesting is that after your suggested changes, disk read IO dropped on all nodes while performance improved. Please see the attached screenshot. I would guess that when nodes have only a few items in the queue, NiFi somehow manages to avoid some of the disk IO between processors. Is that possible?

Thanks again, you've saved our NiFi ;)

On Fri, Jan 29, 2021 at 2:20 PM Mark Payne <marka...@hotmail.com> wrote:

Zilvinas,

The backpressure is per-node, not cluster-wide. So if you set a backpressure of 100, that means that each node will allow up to 100 FlowFiles into that connection. In fact, almost all of these types of settings in NiFi are per-node settings: the number of concurrent tasks, the thread pool size, backpressure, the sizes of the repositories, etc. Generally, this works very nicely, because it allows you to ensure that each node brings in only the work that it can handle.

Thanks
-Mark
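As a back-of-the-envelope illustration of the per-node behaviour Mark describes above (a sketch only: the threshold and node count are the figures quoted in this thread, and the helper function is illustrative, not NiFi code):

    def max_queued_cluster_wide(per_node_threshold: int, node_count: int) -> int:
        """Each node enforces the object threshold independently, so the
        cluster-wide queue shown in the UI can grow to roughly
        threshold * nodes before every node is applying backpressure."""
        return per_node_threshold * node_count

    # Figures from this thread: threshold of 1,000 objects, 25 nodes.
    print(max_queued_cluster_wide(1_000, 25))  # 25000, so seeing ~19,000 queued is expected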
On Jan 29, 2021, at 8:57 AM, Zilvinas Saltys <zilvinas.sal...@verizonmedia.com> wrote:

Mark,

Thank you for your detailed reply. Your suggestions are very helpful, and at least for now it seems we've stopped accumulating large queues on some of the nodes. I do have additional questions about back pressure, if you don't mind. I've watched all your videos, thanks for sharing them with me. Perhaps the next one could be about back pressure?

The way I understand it is that back pressure is configured for the whole connection as an aggregate across all the nodes in the cluster. Meaning, if we have a queue with 100 items and 10 nodes in the cluster, then in the perfect scenario each node would have 10 items. My question is how it actually works internally. If a queue is configured at 100 items max, it has now gone down to 99 so there's room for 1 more, and there are 10 FlowFiles available upstream on every host - which host in the cluster gets to fill that 1 slot? The only way I could think of this working is that when I define a backpressure of 100, NiFi internally sets every node's backpressure to 100/node_count. Is this how it works internally?

Thanks for your help,
Z.

On Thu, Jan 28, 2021 at 8:48 PM Mark Payne <marka...@hotmail.com> wrote:

Zilvinas,

That is accurate - when a connection is load balanced, the data is pushed to a particular node based on the selected algorithm. It is not continually rebalanced.

So for a flow like this, my recommendation would be:

1) Set the backpressure threshold from FetchS3Object -> PublishKafka to something "small" - probably 1 or 2x the number of concurrent tasks that you have for PublishKafka. That means you'll queue up little data there, which will cause the data to instead build up on the connection between EvaluateJsonPath -> FetchS3Object.

2) Set the backpressure threshold from EvaluateJsonPath -> FetchS3Object to something small also, maybe 2x-4x the number of nodes in the cluster. This will result in data not queuing up there either. As a result, as that queue gets smaller on one node, data will begin to flow in and get distributed to one of the nodes in the cluster.

By keeping these queues small, essentially this will cause the "load balanced" data to stay small. As the faster nodes work off their queues, they will allow more data to flow in and be load balanced. The nodes that are not performing well will still have backpressure applied, so they won't get much of the data as it flows in.

As your flow is right now, there's no backpressure being hit in the load-balanced queue. As a result, data streams in as fast as it can and gets distributed across the cluster, and the nodes that can't keep up already have a huge backlog of SQS messages. So making this adjustment will help to distribute the data to the nodes as they become able to handle it.

Thanks
-Mark

On Jan 28, 2021, at 3:27 PM, Zilvinas Saltys <zilvinas.sal...@verizonmedia.com> wrote:

My other issue is that load balancing is not rebalancing the queue. Perhaps I misunderstand how balancing should work and it only round-robins new incoming files? I can easily rebalance manually by disabling balancing and enabling it again, but after a while it gets back to the same situation, where some nodes fall further and further behind while others remain fine.

On Thu, Jan 28, 2021 at 8:22 PM Zilvinas Saltys <zilvinas.sal...@verizonmedia.com> wrote:

Hi Joe,

Yes, it is the same issue. We have used your advice and reduced the number of threads on our heavy processors - fetch/compress/publish - to a minimum, then increased them gradually to 4 until the processing rate became acceptable (about 2000 files per 5 min). This is a cluster of 25 nodes with 36 cores each.

On Thu, Jan 28, 2021 at 8:19 PM Joe Witt <joe.w...@gmail.com> wrote:

I'm assuming this is also the same thing Maksym was asking about yesterday. Let's try to keep the thread together as this gets discussed.
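To make the sizing Mark suggests above concrete, here is a small sketch. The concurrent-task and node counts are the figures mentioned in this thread, the multipliers are the ranges Mark gives, and the function itself is purely illustrative:

    # Per-node backpressure object thresholds following Mark's suggestion above:
    #   FetchS3Object -> PublishKafka     : 1-2x the PublishKafka concurrent tasks
    #   EvaluateJsonPath -> FetchS3Object : 2-4x the number of nodes in the cluster

    def suggested_thresholds(publish_kafka_tasks: int, node_count: int) -> dict:
        return {
            "FetchS3Object -> PublishKafka": (publish_kafka_tasks, 2 * publish_kafka_tasks),
            "EvaluateJsonPath -> FetchS3Object": (2 * node_count, 4 * node_count),
        }

    # Example figures from this thread: 4 concurrent tasks on PublishKafka, 25 nodes.
    for connection, (low, high) in suggested_thresholds(4, 25).items():
        print(f"{connection}: {low}-{high} FlowFiles per node")
    # FetchS3Object -> PublishKafka: 4-8 FlowFiles per node
    # EvaluateJsonPath -> FetchS3Object: 50-100 FlowFiles per node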
On Thu, Jan 28, 2021 at 1:10 PM Pierre Villard <pierre.villard...@gmail.com> wrote:

Hi Zilvinas,

I'm afraid we would need more details to help you out here.

My first question from quickly looking at the graph would be: there is a host (green line) where the number of queued FlowFiles is more or less constantly growing. Where in the flow are the FlowFiles accumulating for this node? Which processor is creating back pressure? Do we have anything in the log for this node around the time when FlowFiles start accumulating?

Thanks,
Pierre

On Fri, Jan 29, 2021 at 00:02, Zilvinas Saltys <zilvinas.sal...@verizonmedia.com> wrote:

Hi,

We run a 25-node NiFi cluster on version 1.12. We're processing about 2000 files per 5 minutes, where each file is 100 to 500 megabytes.

What I notice is that some workers degrade in performance and keep accumulating a queued-files delay. See the attached screenshots, which show two hosts where one is degraded.

One seemingly dead giveaway is that the degraded node starts doing heavy disk read IO while the other node does almost none. I ran iostat on those nodes and I know that the read IOs are on the content_repository directory. But it makes no sense to me how some of the nodes that are doing these heavy tasks show no disk reads at all. In this example I know that both nodes are processing roughly the same number of files of the same size.

The pipeline is fairly simple:
1) Read from SQS
2) Fetch file contents from S3
3) Publish file contents to Kafka
4) Compress file contents
5) Put compressed contents back to S3

To my understanding, all of these operations should require heavy reads from local disk to fetch file contents from the content repository. How is it possible that some nodes are processing lots of files without showing any disk reads, and then suddenly spike in disk reads and degrade?

Any clues would be really helpful.
Thanks.

<Screenshot 2021-01-29 at 15.12.43.png>
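A rough sketch of the arithmetic behind the disk-cache explanation at the top of the thread, using the file counts and sizes quoted above; the average file size and memory figure are assumptions, not measurements from the cluster:

    def content_per_node_gb(files_per_window: int, avg_file_mb: float, node_count: int) -> float:
        """New content written to each node's content_repository per 5-minute window."""
        return files_per_window * avg_file_mb / node_count / 1024

    # Figures from this thread: ~2000 files per 5 minutes, 100-500 MB each
    # (~300 MB assumed average), spread across 25 nodes.
    print(f"~{content_per_node_gb(2_000, 300, 25):.0f} GB per node per 5 minutes")  # ~23 GB

    # If a node's page cache (say ~64 GB of free memory - an assumption, not a figure
    # from the thread) still holds the content being worked on, reads are served from
    # memory. Once a backlog outgrows the cache, reads fall through to the EBS volume
    # and throughput degrades, which matches the iostat observation above.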