Please see this!

At 2021-11-24 13:58:03, "Lidong Dai" <[email protected]> wrote:
>I think you can check the runtime logs on the master server and worker
>server for WARN/ERROR messages from around the time you received the
>"hung up" alarm.
>
>
>Best Regards
>
>
>
>---------------
>Apache DolphinScheduler PMC Chair
>LidongDai
>[email protected]
>Linkedin: https://www.linkedin.com/in/dailidong
>Twitter: @WorkflowEasy <https://twitter.com/WorkflowEasy>
>---------------
>
>
>On Mon, Nov 22, 2021 at 10:54 AM 王峰 <[email protected]> wrote:
>
>> 3 nodes; the 2 master/worker servers are all on the same machine. There was
>> no downtime, but an alarm reported that the server service had hung up. I
>> guess that insufficient machine resources affected the server and triggered
>> fault tolerance. After the error status was returned, the actual task did
>> not stop, and a new task instance was started on a new server.
>>
>> At 2021-11-21 18:41:49, "Lidong Dai" <[email protected]> wrote:
>> >Hi,
>> >Can you describe the question more clearly? Does the host load refer to
>> >the Master or the Worker server? Is any server down?
>> >
>> >Best Regards
>> >
>> >
>> >
>> >---------------
>> >Apache DolphinScheduler PMC Chair
>> >LidongDai
>> >[email protected]
>> >Linkedin: https://www.linkedin.com/in/dailidong
>> >Twitter: @WorkflowEasy
>> >---------------
>> >
>> >On Sun, Nov 21, 2021 at 3:59 PM 王峰 <[email protected]> wrote:
>> >>
>> >> DolphinScheduler 1.3.3 cluster
>> >>
>> >> Here is the scenario: because the host load is too high, master fault
>> >> tolerance may be triggered mid-run, and the same workflow instance is
>> >> then run twice (the two runs overlap in time), which doubles the data.
>>
