Github user arkadius commented on the issue:
https://github.com/apache/nifi/pull/3111
Hi @markap14. Thank you for your reply. We have a quite heavily loaded
environment. We process about 5 GB/min of messages on a 32-core host.
Avro serialization was our main bottleneck. On average, 30 threads were
concurrently using this service. CPU usage was at 85%, iowait near 0, load at 50.
Increasing the number of threads didn't help. As I wrote in the issue, I found
out by sampling (taking occasional thread dumps) that the synchronization was
the problem: most of the threads were blocked on it. After this modification,
the performance issue no longer occurred.
Regarding the removal of the limit on cache entries: I agree with you
that it isn't the best approach. The limit of 20 schemas wasn't either. In
systems with multiple flows that share the same service, this limit would
be hit constantly. What do you think about introducing two parameters to this
service: Cache Size Limit and Time To Live? When either limit is hit, an entry
would be evicted. The condition would be checked on each entry get, similar to
the `cachedSchema.isOlderThan` check in `CachingSchemaRegistryClient`. If we
assume that these limits are soft, no synchronization is needed.
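
To make the proposal concrete, here is a rough sketch of what I have in mind. It is not code from this PR or from NiFi: the class name `SoftLimitCache`, its constructor parameters, and the injected clock are all hypothetical, and the only thing it relies on is `java.util.concurrent.ConcurrentHashMap`. Both limits are checked on get without any lock, so under contention the cache may briefly hold more than the size limit or load the same key twice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch: a cache with a soft size limit and a TTL, no synchronized blocks.
class SoftLimitCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long createdAt;
        Entry(V value, long createdAt) { this.value = value; this.createdAt = createdAt; }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final int sizeLimit;      // "Cache Size Limit" (soft)
    private final long ttl;           // "Time To Live", in the clock's units
    private final LongSupplier clock; // injected so the behavior can be tested

    SoftLimitCache(int sizeLimit, long ttl, LongSupplier clock) {
        this.sizeLimit = sizeLimit;
        this.ttl = ttl;
        this.clock = clock;
    }

    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null) {
            if (now - cached.createdAt < ttl) {
                return cached.value;        // fresh hit, no locking involved
            }
            entries.remove(key, cached);    // expired: drop only if still the same entry
        }
        V value = loader.apply(key);        // concurrent misses may both load; accepted
        if (entries.size() >= sizeLimit) {  // check-then-act race makes the limit soft
            evict(now);
        }
        entries.put(key, new Entry<>(value, now));
        return value;
    }

    // Removes expired entries; if the cache is still full, removes the oldest one.
    private void evict(long now) {
        K oldestKey = null;
        long oldestAge = -1;
        for (Map.Entry<K, Entry<V>> e : entries.entrySet()) {
            long age = now - e.getValue().createdAt;
            if (age >= ttl) {
                entries.remove(e.getKey(), e.getValue());
            } else if (age > oldestAge) {
                oldestAge = age;
                oldestKey = e.getKey();
            }
        }
        if (oldestKey != null && entries.size() >= sizeLimit) {
            entries.remove(oldestKey);
        }
    }

    int size() { return entries.size(); }
}
```

The eviction scan is linear in the cache size, which should be acceptable if it only runs on a miss with a full cache; a real implementation might prefer an existing caching library instead.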
---