You can try adjusting the *maxUncommittedOffsets* and *offsetCommitPeriodMs*
parameters according to your load and the number of bolts. Together these two
settings bound how many tuples can be in flight in your topology at once, so
by tuning them you can control capacity fairly directly.
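As a rough sketch of where those knobs live (assuming the storm-kafka-client spout; the broker address, topic name, and the two values below are placeholders you'd tune for your own load):

```java
import org.apache.storm.kafka.spout.KafkaSpoutConfig;

// Hypothetical configuration fragment, not a drop-in topology.
KafkaSpoutConfig<String, String> spoutConf =
    KafkaSpoutConfig.builder("kafka-broker:9092", "events-topic")
        // Cap on offsets emitted but not yet committed; the spout stops
        // polling Kafka once this many tuples are in flight downstream.
        .setMaxUncommittedOffsets(10_000)
        // How often (in ms) the spout commits processed offsets back to Kafka.
        .setOffsetCommitPeriodMs(30_000)
        .build();
```

A lower *maxUncommittedOffsets* keeps fewer tuples queued in front of a slow bolt, at the cost of spout throughput during normal operation.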

On Fri, Apr 24, 2020 at 11:09 PM Abhishek Raj <[email protected]>
wrote:

> Hi,
>
> We have a topology which consumes data generated by multiple applications
> via Kafka. The data for one application is aggregated in a single bolt task
> using fields grouping. The applications push data at different rates, so
> some executors of the bolt are busier/more heavily loaded than others and
> the capacity distribution is non-uniform.
>
> The problem we're facing now is that when there's a spike in the data
> produced by one (or more) applications, capacity goes up for that executor,
> we see frequent GC pauses, and eventually the corresponding JVM crashes,
> causing worker restarts.
>
> As an ideal solution, we want to slow down only the application(s) that
> cause the spike. We cannot use the built-in backpressure here because it
> operates at the spout level and slows down the entire pipeline.
>
> What are your thoughts on this? How can we fix this?
>
> Thanks
>
