William Lo created GOBBLIN-2126:
-----------------------------------

             Summary: Implement caching for resources uploaded to hdfs by 
Gobblin Yarn jobs
                 Key: GOBBLIN-2126
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2126
             Project: Apache Gobblin
          Issue Type: Improvement
            Reporter: William Lo


Currently, Gobblin Yarn jobs re-upload their jars to HDFS on every execution.
Instead, we want to keep a running cache, similar to the one MR uses, that is 
cleaned up at a monthly interval (the interval can be made configurable in the 
future). The cache ensures that files are not repeatedly uploaded to HDFS, 
which is a slow operation.
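
The caching idea described above can be sketched roughly as follows. This is 
only an illustration under stated assumptions, not Gobblin's actual 
implementation: the class and method names (JarCacheSketch, ensureCached, 
cleanOldBuckets) are hypothetical, the month-named bucket directory layout is 
an assumption, and java.nio.file on a local directory stands in for HDFS so 
that the example runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.YearMonth;
import java.time.format.DateTimeParseException;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of a month-bucketed jar cache; a local directory stands in for HDFS. */
class JarCacheSketch {

  /** Returns the cached copy of localJar, copying it only if this month's bucket lacks it. */
  static Path ensureCached(Path cacheRoot, Path localJar, YearMonth month) throws IOException {
    // Assumed layout: <cacheRoot>/<yyyy-MM>/<jar file name>
    Path bucket = cacheRoot.resolve(month.toString());
    Path cached = bucket.resolve(localJar.getFileName().toString());
    if (Files.exists(cached)) {
      // Cache hit: skip the slow upload. A real cache would also need to key on
      // a version or checksum so that a changed jar is not served stale.
      return cached;
    }
    Files.createDirectories(bucket);
    Files.copy(localJar, cached); // stands in for the upload to HDFS
    return cached;
  }

  /** Deletes every month-named bucket that is older than the given month. */
  static void cleanOldBuckets(Path cacheRoot, YearMonth current) throws IOException {
    if (!Files.isDirectory(cacheRoot)) {
      return;
    }
    List<Path> buckets;
    try (Stream<Path> children = Files.list(cacheRoot)) {
      buckets = children.collect(Collectors.toList());
    }
    for (Path bucket : buckets) {
      YearMonth bucketMonth;
      try {
        bucketMonth = YearMonth.parse(bucket.getFileName().toString());
      } catch (DateTimeParseException e) {
        continue; // not a month bucket, leave it alone
      }
      if (bucketMonth.isBefore(current)) {
        deleteRecursively(bucket);
      }
    }
  }

  /** Deletes a directory tree, children before parents. */
  static void deleteRecursively(Path dir) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

With this layout, a job would call ensureCached for each jar before 
submitting the YARN application and run cleanOldBuckets occasionally. An 
HDFS-backed version would presumably go through Hadoop's FileSystem API 
instead of java.nio.file.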

This should significantly speed up the bootstrapping of a YARN application in 
Gobblin for Temporal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
