> I am already using Tez (sorry, forgot to mention this), and my goal is
> indeed to build the instance once per container.

Put a log line in your UDF init() and check whether it is being called
multiple times per container. If you're loading the data every time, then
that might be something to fix.
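
A rough sketch of what I mean, in plain Java with no Hive classes so it
stands on its own. The class name CacheLoadCounter, the methods getOrLoad()
and loadCount(), and the placeholder loader are all made up for illustration;
your real loader would read the distributed cache file instead. The idea is
that static state is shared by everything running in the same JVM, so the
log line should show up once per container if the instance is really being
reused.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Hive: holds one instance per JVM and
// counts how many times the expensive load actually runs.
final class CacheLoadCounter {
    // Static fields are shared by every UDF instance in this JVM.
    private static final AtomicInteger LOADS = new AtomicInteger();
    private static volatile Object cached;

    private CacheLoadCounter() {}

    // Call this from the UDF's init method instead of loading directly.
    static Object getOrLoad() {
        Object local = cached;
        if (local == null) {
            synchronized (CacheLoadCounter.class) {
                local = cached;
                if (local == null) {
                    local = load();
                    cached = local;
                }
            }
        }
        return local;
    }

    // Placeholder for the real distributed cache read (assumption).
    private static Object load() {
        int n = LOADS.incrementAndGet();
        // The runtime name usually identifies the JVM process, which helps
        // when grepping the container logs.
        String jvm = ManagementFactory.getRuntimeMXBean().getName();
        System.err.println("cache load #" + n + " in JVM " + jvm);
        return new Object();
    }

    // Number of loads performed so far in this JVM.
    static int loadCount() {
        return LOADS.get();
    }
}
```

If the load number in the container logs goes past 1 for the same JVM, the
data is being rebuilt more often than once per container.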

The other aspect is that repeated loading can cause GC pauses, and there
may be other extraneous reasons like that for the slow-down.

But first, look at how many times you are loading the distributed cache
data per container.

Cheers,
Gopal

