[ https://issues.apache.org/jira/browse/GOBBLIN-227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Baranick updated GOBBLIN-227:
----------------------------------
    Description:
*Precondition:*
Using Hocon configuration with two forks configured.

*Summary:*
When {{AbstractJobLauncher}} calls {{JobLauncherUtils.cleanTaskStagingData}}, it tries to look up {{writer.staging.dir}} in the configuration and fails.

*Details:*
Hocon configuration cannot hold a value at a key and also hold child keys beneath it, so the following config does not work as intended:
{code:none}
writer.staging.dir=/foo
writer.staging.dir.0=/foo
writer.staging.dir.1=/foo
{code}
Initially {{writer.staging.dir}} is of type String, but when the Hocon parser encounters {{writer.staging.dir.0}}, it decides that {{writer.staging.dir}} is now of type Object, overwriting the prior value with {{_\{"0": "/foo"\}_}}. The effective Hocon configuration is:
{code:javascript}
{
  "writer": {
    "staging": {
      "dir": {
        "0": "/foo",
        "1": "/foo"
      }
    }
  }
}
{code}
Fork-specific configuration uses the same config keys as regular configuration, except that the fork number is appended, for example {{.1}}. The code that looks up fork-specific configuration doesn't automatically fall back to the regular configuration. For example, if the code is trying to find {{writer.staging.dir.0}} and it isn't configured, the job will fail. This means that all forks must configure fork-specific versions of {{writer.staging.dir}}.

When {{AbstractJobLauncher}} calls {{JobLauncherUtils.cleanTaskStagingData}}, it cleans up based on the current job's configuration. Because of this, {{fork.branches}} is always set to {{1}}. The call to {{WriterUtils.getWriterStagingDir(state, numBranches, branchId)}} is made with {{numBranches=1}} and {{branchId=0}}. This results in the method looking for {{writer.staging.dir}}. Unfortunately, when using Hocon configuration, the value {{writer.staging.dir}} doesn't exist and the job fails.

> JobLauncherUtils.cleanTaskStagingData fails for jobs with forks
> ---------------------------------------------------------------
>
>                 Key: GOBBLIN-227
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-227
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Joel Baranick
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
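
The failure described in this issue can be reproduced in miniature without Gobblin or a Hocon parser. The following is a minimal sketch under stated assumptions, not Gobblin's code: {{ForkKeySketch}}, {{keyForBranch}}, {{put}}, {{getString}}, and {{buildConfig}} are hypothetical names, and the nested map only imitates the overwrite behavior described above, where the String at {{writer.staging.dir}} is replaced by an Object once {{writer.staging.dir.0}} is added. The branch-suffix rule is inferred from the description (a lookup with {{numBranches=1}} uses the plain key).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Gobblin.
class ForkKeySketch {

    // Assumed rule: the branch id is appended only when there is more than one branch.
    static String keyForBranch(String key, int numBranches, int branchId) {
        return numBranches > 1 ? key + "." + branchId : key;
    }

    // Inserts a dotted key into a nested map. If an intermediate path element
    // currently holds a scalar, it is replaced by an object (the scalar is lost).
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> root, String dottedKey, String value) {
        String[] parts = dottedKey.split("\\.");
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Object child = node.get(parts[i]);
            if (!(child instanceof Map)) {
                child = new LinkedHashMap<String, Object>();
                node.put(parts[i], child); // overwrites any earlier String value
            }
            node = (Map<String, Object>) child;
        }
        node.put(parts[parts.length - 1], value);
    }

    // Returns the String stored at a dotted path, or null if the path is
    // missing or does not end in a String.
    static String getString(Map<String, Object> root, String dottedKey) {
        Object node = root;
        for (String part : dottedKey.split("\\.")) {
            if (!(node instanceof Map)) {
                return null;
            }
            node = ((Map<?, ?>) node).get(part);
        }
        return node instanceof String ? (String) node : null;
    }

    // Builds the three-line config from the issue, in the same order.
    static Map<String, Object> buildConfig() {
        Map<String, Object> root = new LinkedHashMap<>();
        put(root, "writer.staging.dir", "/foo");
        put(root, "writer.staging.dir.0", "/foo");
        put(root, "writer.staging.dir.1", "/foo");
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> config = buildConfig();
        System.out.println(config); // {writer={staging={dir={0=/foo, 1=/foo}}}}

        // Fork-aware lookup (two branches, branch 0): the suffixed key resolves.
        String forkKey = keyForBranch("writer.staging.dir", 2, 0);
        System.out.println(forkKey + " -> " + getString(config, forkKey)); // writer.staging.dir.0 -> /foo

        // Cleanup-style lookup (numBranches=1, branchId=0): the plain key is no longer a String.
        String jobKey = keyForBranch("writer.staging.dir", 1, 0);
        System.out.println(jobKey + " -> " + getString(config, jobKey)); // writer.staging.dir -> null
    }
}
```

With two branches the lookup key is {{writer.staging.dir.0}}, which resolves. With {{numBranches=1}} the key is {{writer.staging.dir}}, which now holds an Object rather than a String, mirroring the lookup that fails during staging cleanup.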