Github user cestella commented on the issue:
    Just FYI, as part of the performance experimentation in the lab here, we 
found that one major impediment to scale was the Guava cache in this topology 
once the cache reaches a non-trivial size (e.g. 10k+).  Swapping it out for 
Caffeine immediately had a substantial effect.  I created #947 to migrate the 
split/join infrastructure to 
use Caffeine as well and will look at the performance impact of that change.  I 
wanted to separate that work from here as it may be that Guava performance is 
fine outside of an explicit thread pool like we have here.


