Hanghang Liu created GOBBLIN-1947:
-------------------------------------
Summary: Send WorkUnitChangeEvent when Helix task consistently
fails
Key: GOBBLIN-1947
URL: https://issues.apache.org/jira/browse/GOBBLIN-1947
Project: Apache Gobblin
Issue Type: New Feature
Components: gobblin-cluster
Reporter: Hanghang Liu
Assignee: Hung Tran
When YarnAutoScalingManager detects that a Helix task is consistently failing,
provide an option to send a WorkUnitChangeEvent so that GobblinHelixJobLauncher
can handle the event and split the work unit at runtime. This can help resolve
consistently failing containers (for example, due to OOM) at runtime instead of
relying on the replanner to restart the whole pipeline.
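
As a rough illustration of the behavior requested above, the sketch below counts consecutive failures per task and, when an opt-in flag is set, fires a callback once the count reaches a threshold. It is not Gobblin code: ConsecutiveFailureTracker, its threshold and splitEnabled parameters, and the onSplitRequested callback are all hypothetical names. In the actual feature the callback's role would presumably be played by sending a WorkUnitChangeEvent for GobblinHelixJobLauncher to handle.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch only: none of these names are Gobblin or Helix APIs.
final class ConsecutiveFailureTracker {
  private final int threshold;                       // assumed: failures in a row before reacting
  private final boolean splitEnabled;                // assumed: the opt-in "option" from the issue
  private final Consumer<String> onSplitRequested;   // stand-in for sending a change event
  private final Map<String, Integer> consecutiveFailures = new HashMap<>();

  ConsecutiveFailureTracker(int threshold, boolean splitEnabled,
                            Consumer<String> onSplitRequested) {
    if (threshold < 1) {
      throw new IllegalArgumentException("threshold must be >= 1");
    }
    this.threshold = threshold;
    this.splitEnabled = splitEnabled;
    this.onSplitRequested = onSplitRequested;
  }

  /** Records the outcome of one attempt of the given task. */
  void record(String taskId, boolean succeeded) {
    if (succeeded) {
      // A success breaks the streak, so the task is no longer "consistently" failing.
      consecutiveFailures.remove(taskId);
      return;
    }
    int failures = consecutiveFailures.merge(taskId, 1, Integer::sum);
    // Fire exactly once per streak, at the moment the threshold is reached.
    if (splitEnabled && failures == threshold) {
      onSplitRequested.accept(taskId);
    }
  }
}
```

With a threshold of 3, three failed attempts of the same task in a row trigger the callback once. A success in between resets the streak, and with splitEnabled set to false nothing is triggered, which would leave recovery to the existing replanner path.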