Hanghang Liu created GOBBLIN-1947:
-------------------------------------

             Summary: Send WorkUnitChangeEvent when a Helix task consistently fails
                 Key: GOBBLIN-1947
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1947
             Project: Apache Gobblin
          Issue Type: New Feature
          Components: gobblin-cluster
            Reporter: Hanghang Liu
            Assignee: Hung Tran


When YarnAutoScalingManager detects that a Helix task is consistently failing,
provide an option to send a WorkUnitChangeEvent so that GobblinHelixJobLauncher
can handle the event and split the work unit at runtime. This can help resolve
consistently failing containers (for example, due to OOM) at runtime instead of
relying on the replanner to restart the whole pipeline.
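
The detection step described above could be sketched roughly as follows. This
is only an illustration of "fire a callback once a task has failed N times in
a row"; the class ConsecutiveFailureTracker, its methods, and the threshold
parameter are hypothetical and do not come from the Gobblin codebase. In a
real change, the callback would presumably be where a WorkUnitChangeEvent is
posted for GobblinHelixJobLauncher to handle.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names exist in Gobblin.
public class ConsecutiveFailureTracker {
    private final int threshold;
    private final Consumer<String> onConsistentFailure;
    // Consecutive failure count per task ID; reset on success.
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();

    public ConsecutiveFailureTracker(int threshold, Consumer<String> onConsistentFailure) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
        this.onConsistentFailure = onConsistentFailure;
    }

    // Record one failed attempt. The callback fires exactly once, when the
    // count of consecutive failures first reaches the threshold.
    public void recordFailure(String taskId) {
        int count = consecutiveFailures.merge(taskId, 1, Integer::sum);
        if (count == threshold) {
            onConsistentFailure.accept(taskId);
        }
    }

    // A successful attempt clears the task's failure streak.
    public void recordSuccess(String taskId) {
        consecutiveFailures.remove(taskId);
    }
}
```

Firing only when the count first reaches the threshold avoids sending repeated
events for the same failing task; a success resets the streak so that
intermittent failures are not treated as consistent ones.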



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
