[
https://issues.apache.org/jira/browse/GOBBLIN-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
William Lo updated GOBBLIN-1865:
--------------------------------
Description:
With the "gobblin.cluster.job.useGeneratedJobIds" configuration, jobs with that
prefix should be using the system timestamp of the Gobblin cluster instead of a
provided flow execution ID.
Instead, it is more consistent to append the flowExecutionId to the jobName and
then append a timestamp after it, so that all earlystop jobs relating to a
flow execution can be tracked.
Job names should now have the following structure:
job_ActualJob<jobName>_<flowExecutionId>_<timestamp>
The timestamp is needed so that Helix can run concurrent jobs given a job ID.
was:
With "gobblin.cluster.job.useGeneratedJobIds" configuration, jobs with that
prefix should be using the system timestamp of Gobblin cluster instead of a
provided flow execution ID.
Instead of this, it is more consistent to append flowExecutionId to a jobName
then append a timestamp on top of that, so that all earlystop jobs relating to
a flow execution can be tracked.
Now jobNames should have the following structure:
job_ActualJob<jobName>_<flowExecutionId>_<timestamp>
> Fix bug where overriding job execution IDs causes issues with earlystop jobs
> and job tracking
> -------------------------------------------------------------------------------------------
>
> Key: GOBBLIN-1865
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1865
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-cluster
> Reporter: William Lo
> Assignee: Hung Tran
> Priority: Major
>
> With the "gobblin.cluster.job.useGeneratedJobIds" configuration, jobs with that
> prefix should be using the system timestamp of the Gobblin cluster instead of a
> provided flow execution ID.
> Instead, it is more consistent to append the flowExecutionId to the jobName and
> then append a timestamp after it, so that all earlystop jobs relating
> to a flow execution can be tracked.
> Job names should now have the following structure:
> job_ActualJob<jobName>_<flowExecutionId>_<timestamp>
> The timestamp is needed so that Helix can run concurrent jobs given a job ID.
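
A minimal sketch of the naming scheme described above, for illustration only. The class and method names (JobNameSketch, buildJobId, belongsToFlowExecution) are hypothetical and are not taken from the Gobblin code base; the "job_ActualJob" prefix is copied literally from the structure in the description, and the timestamp is assumed to be a millisecond value supplied by the caller.

```java
// Hypothetical helper illustrating the structure
// job_ActualJob<jobName>_<flowExecutionId>_<timestamp>.
// This is NOT Gobblin's actual implementation.
final class JobNameSketch {
    // Prefix copied literally from the structure in the issue description.
    private static final String PREFIX = "job_ActualJob";

    private JobNameSketch() {
    }

    // Builds an ID by appending the flow execution ID to the job name,
    // then appending a timestamp after it.
    static String buildJobId(String jobName, long flowExecutionId, long timestampMillis) {
        return PREFIX + jobName + "_" + flowExecutionId + "_" + timestampMillis;
    }

    // Returns true when the ID was built for the given job name and flow
    // execution, whatever timestamp suffix it carries. This is what allows
    // all jobs of one flow execution to be tracked together.
    static boolean belongsToFlowExecution(String jobId, String jobName, long flowExecutionId) {
        return jobId.startsWith(PREFIX + jobName + "_" + flowExecutionId + "_");
    }
}
```

Under these assumptions, two IDs built for the same jobName and flowExecutionId at different times share a common prefix but differ in the timestamp suffix, so they remain distinct IDs while still being attributable to the same flow execution.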