[ 
https://issues.apache.org/jira/browse/GOBBLIN-1653?focusedWorklogId=777078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777078
 ]

ASF GitHub Bot logged work on GOBBLIN-1653:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jun/22 18:14
            Start Date: 01/Jun/22 18:14
    Worklog Time Spent: 10m 
      Work Description: Will-Lo commented on code in PR #3514:
URL: https://github.com/apache/gobblin/pull/3514#discussion_r887162629


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/spec/JobExecutionPlan.java:
##########
@@ -108,8 +109,13 @@ private static JobSpec buildJobSpec(FlowSpec flowSpec, Config jobConfig, Long fl
 
       // Modify the job name to include the flow group, flow name, edge id, and a random string to avoid collisions since
       // job names are assumed to be unique within a dag.
-      jobName = Joiner.on(JOB_NAME_COMPONENT_SEPARATION_CHAR).join(flowGroup, flowName, jobName, edgeId, flowInputPath.hashCode());
-
+      int hash = flowInputPath.hashCode();
+      jobName = Joiner.on(JOB_NAME_COMPONENT_SEPARATION_CHAR).join(flowGroup, flowName, jobName, edgeId, hash);
+      // jobNames are commonly used as a directory name, which is limited to 255 characters
+      if (jobName.length() >= MAX_JOB_NAME_LENGTH) {
+        // shorten job length to be 128 characters (flowGroup) + (hashed) flowName, hashCode length
+        jobName = Joiner.on(JOB_NAME_COMPONENT_SEPARATION_CHAR).join(flowGroup, flowName.hashCode(), hash);

Review Comment:
   The shortened name also includes the hash at the end, so another flow
   wouldn't get the same job name unless it also runs on the same executor and
   has the same input path as the original flow. Even then, it shouldn't be a
   problem unless the two are running concurrently.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 777078)
    Time Spent: 1.5h  (was: 1h 20m)

> Long flownames and flowgroup combinations can exceed maximum component length 
> of folder
> ---------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1653
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1653
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-service
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Gobblin uses jobName to create folder paths for temporary work folders. In 
> GaaS, the jobName is composed of the flowGroup, flowName, edge ID, and a 
> hash. This combination can exceed the maximum folder component length if the 
> flowName and flowGroup approach their maximums (128 characters). Instead of 
> enforcing a shorter flowGroup/flowName (which would require many DB 
> migrations), we should shorten the jobName sent to Gobblin, as it's only used 
> for temporary file storage.
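
The shortening described above can be sketched in isolation. This is a minimal illustration based on the diff in this thread, not the actual Gobblin code: the class name `JobNameSketch`, the method `buildJobName`, the `"_"` separator, and the value 255 for `MAX_JOB_NAME_LENGTH` are assumptions, and `String.join` stands in for Guava's `Joiner` so the example has no dependencies.

```java
import java.util.Collections;

public class JobNameSketch {
  // Assumed values: the diff references these constants but does not show their definitions.
  static final String SEPARATOR = "_";
  static final int MAX_JOB_NAME_LENGTH = 255;

  // Hypothetical helper mirroring the logic in the diff.
  static String buildJobName(String flowGroup, String flowName, String jobName,
      String edgeId, String flowInputPath) {
    int hash = flowInputPath.hashCode();
    String fullName = String.join(SEPARATOR,
        flowGroup, flowName, jobName, edgeId, String.valueOf(hash));
    if (fullName.length() >= MAX_JOB_NAME_LENGTH) {
      // Too long for a directory name: keep flowGroup, replace flowName with
      // its hash code, and keep the input-path hash to limit collisions.
      return String.join(SEPARATOR,
          flowGroup, String.valueOf(flowName.hashCode()), String.valueOf(hash));
    }
    return fullName;
  }

  public static void main(String[] args) {
    // Worst case from the issue: flowGroup and flowName both at 128 characters.
    String longGroup = String.join("", Collections.nCopies(128, "g"));
    String longName = String.join("", Collections.nCopies(128, "n"));
    String shortened = buildJobName(longGroup, longName, "job", "edge", "/data/in");
    System.out.println(shortened.length());
  }
}
```

Under these assumptions the shortened name is at most 128 + 2 + 2 * 11 = 152 characters, since an `int` hash code prints as at most 11 characters including the sign.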



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
