Hi,

IMO, Broadcast is the better way to do this, because it reduces the QPS against the external service: the data is fetched once and then distributed, instead of being fetched separately by every parallel subtask. If you do not want to use Broadcast, try a RichFunction: start a thread in the open() method that refreshes the data regularly. Be careful to clean up your data and threads in the close() method, otherwise they will leak.
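A minimal sketch of the RichFunction approach, written as plain Java so it runs without Flink on the classpath. It is only an illustration under assumptions: the class name `RefreshingValidator`, the `loader` supplier (standing in for your HTTPS fetch), and the `"UNKNOWN"` fallback are all hypothetical. In a real job the class would extend something like `RichMapFunction<String, String>`, `open()`/`close()` would override the Flink lifecycle methods, and the loader would have to be serializable.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Plain-Java stand-in for a Flink rich function (hypothetical name).
// Each parallel instance keeps its own in-memory copy of the static data.
class RefreshingValidator {
    private final Supplier<Map<String, String>> loader; // hypothetical: wraps the HTTPS call
    private final long periodMillis;                    // e.g. one hour in the original setup

    // Created in open(), so they are never shipped with the serialized function.
    private transient volatile Map<String, String> staticData;
    private transient ScheduledExecutorService scheduler;

    RefreshingValidator(Supplier<Map<String, String>> loader, long periodMillis) {
        this.loader = loader;
        this.periodMillis = periodMillis;
    }

    // Corresponds to RichFunction#open: load once, then refresh in the background.
    public void open() {
        staticData = loader.get();
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "static-data-refresh");
            t.setDaemon(true); // do not keep the JVM alive because of this thread
            return t;
        });
        scheduler.scheduleAtFixedRate(this::refresh, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    private void refresh() {
        try {
            staticData = loader.get(); // swap in the new snapshot in one volatile write
        } catch (RuntimeException e) {
            // Keep the previous snapshot; an exception escaping here would
            // cancel all future scheduled runs.
        }
    }

    // Corresponds to map(): look the event key up in the current snapshot.
    public String map(String key) {
        Map<String, String> snapshot = staticData;
        return snapshot.getOrDefault(key, "UNKNOWN");
    }

    // Corresponds to RichFunction#close: stop the thread and drop the data.
    public void close() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
        staticData = null;
    }
}
```

Note that with this pattern every parallel subtask calls the external service on its own schedule, so the request rate grows with the parallelism of the operator.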
Best,
Weihua

On Tue, Jun 14, 2022 at 12:04 AM Great Info <gubt...@gmail.com> wrote:

> Hi,
> I have one flink job which has two tasks
> Task1- Source some static data over https and keep it in memory, this
> keeps refreshing it every 1 hour
> Task2- Process some real-time events from Kafka and uses static data to
> validate something and transform, then forward to other Kafka topic.
>
> so far, everything was running on the same Task manager(same node), but
> due to some recent scaling requirements need to enable partitioning on
> Task2 and that will make some partitions run on other task managers. but
> other task managers don't have the static data
>
> is there a way to run Task1 on all the task managers? I don't want to
> enable broadcasting since it is a little huge and also I can not persist
> data in DB due to data regulations.