Thanks for helping with some inputs.
Actually, I have created task1 and task2 in separate slot sharing groups,
since I thought it would be good for them to run in independent slots. But
now I am facing an issue during restarts: whenever task1 throws any
exception, the entire job restarts.
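
For context, the setup described above presumably looks something like the
following sketch against the Flink DataStream API (the stream, function,
and group names here are made up for illustration):

```java
// Hypothetical operators; each task is pinned to its own slot sharing
// group, so the two tasks are deployed into separate slots.
staticData
    .map(new StaticDataRefresher())   // task1: fetch/refresh static data
    .slotSharingGroup("task1-group");

kafkaEvents
    .map(new EventValidator())        // task2: validate/transform events
    .slotSharingGroup("task2-group");
```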

Is there a way to set the restart strategy so that only the tasks in the
same slot sharing group restart on failure?
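
(For reference: the scope of a restart in Flink is governed by the failover
strategy, and with region failover only the pipelined region containing the
failed task is restarted. Note that regions are derived from the data
exchanges between tasks, not from slot sharing groups. A minimal
flink-conf.yaml sketch, with made-up retry numbers:)

```yaml
# restart only the failed pipelined region instead of the whole job
jobmanager.execution.failover-strategy: region

# hypothetical restart strategy; tune attempts/delay as needed
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```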

On Wed, Jun 15, 2022 at 6:13 PM Lijie Wang <wangdachui9...@gmail.com> wrote:

> Hi Great,
>
> Do you mean there is a Task1 and a Task2 on each task manager?
>
> If so, I think you can set Task1 and Task2 to the same parallelism and put
> them in the same slot sharing group. In this way, Task1 and Task2 will be
> deployed into the same slot (that is, onto the same task manager).
>
> You can find more details about slot sharing groups in [1], and how to set
> a slot sharing group in [2].
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/flink-architecture/#task-slots-and-resources
> [2]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#set-slot-sharing-group
>
> Best,
> Lijie
>
Weihua Hu <huweihua....@gmail.com> wrote on Wed, Jun 15, 2022 at 13:16:
>
>> I don't really understand how task2 reads static data from task1,
>> but I think you can integrate the logic of fetching the static data over
>> HTTP from task1 into task2 and keep only one kind of task.
>>
>> Best,
>> Weihua
>>
>>
>> On Wed, Jun 15, 2022 at 10:07 AM Great Info <gubt...@gmail.com> wrote:
>>
>> > Thanks for helping with some inputs. Yes, I am using a rich function,
>> > handling objects created in open(), and the network calls are made in
>> > run(). But currently I am stuck getting this same task to run on *all
>> > task managers* (nodes): when I submit the job, task1 (the static data
>> > task) runs on only one task manager, while I have 3 task managers in my
>> > Flink cluster.
>> >
>> >
>> > On Tue, Jun 14, 2022 at 7:20 PM Weihua Hu <huweihua....@gmail.com>
>> wrote:
>> >
>> >> Hi,
>> >>
>> >> IMO, Broadcast is a better way to do this, as it can reduce the QPS of
>> >> external access.
>> >> If you do not want to use Broadcast, try using a RichFunction: start a
>> >> thread in the open() method to refresh the data regularly, but be
>> >> careful to clean up your data and threads in the close() method,
>> >> otherwise it will lead to leaks.
>> >>
>> >> Best,
>> >> Weihua
>> >>
>> >>
>> >> On Tue, Jun 14, 2022 at 12:04 AM Great Info <gubt...@gmail.com> wrote:
>> >>
>> >>> Hi,
>> >>> I have one flink job which has two tasks
>> >>> Task1 - sources some static data over HTTPS and keeps it in memory,
>> >>> refreshing it every hour.
>> >>> Task2 - processes real-time events from Kafka, uses the static data
>> >>> to validate and transform them, then forwards them to another Kafka
>> >>> topic.
>> >>>
>> >>> So far, everything was running on the same task manager (same node),
>> >>> but due to some recent scaling requirements I need to enable
>> >>> partitioning on Task2, which will make some partitions run on other
>> >>> task managers. But those task managers don't have the static data.
>> >>>
>> >>> Is there a way to run Task1 on all the task managers? I don't want
>> >>> to use broadcasting since the data is fairly large, and I cannot
>> >>> persist it in a DB due to data regulations.
>> >>>
>> >>>
>>
>
