[ 
https://issues.apache.org/jira/browse/BEAM-10077?focusedWorklogId=437496&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-437496
 ]

ASF GitHub Bot logged work on BEAM-10077:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/May/20 03:57
            Start Date: 27/May/20 03:57
    Worklog Time Spent: 10m 
      Work Description: ihji commented on a change in pull request #11813:
URL: https://github.com/apache/beam/pull/11813#discussion_r430576598



##########
File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/Environments.java
##########
@@ -264,6 +259,14 @@ public static Environment createProcessEnvironment(
                   .build()
                   .toByteString());
         }
+        if (stagedName == null) {
+          stagedName = createStagingFileName(file, hashCode);

Review comment:
       I believe an existing unit test already covers this path. This branch is 
taken when the given path does not contain the special '=' character; in that 
case, we generate the staged name ourselves.
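
For readers following the thread, here is a minimal sketch of the fallback described above. It assumes a `name=path` convention for entries that carry an explicit staged name, and a `<base>-<hash><ext>` scheme for generated names. The class `StagedNameSketch`, the `Resolved` holder, and the body of `createStagingFileName` are hypothetical illustrations, not the actual code in `Environments.java`.

```java
import java.io.File;

// Hypothetical sketch; not the actual Environments.java implementation.
final class StagedNameSketch {

  /** Holds the resolved staged name together with the local file path. */
  static final class Resolved {
    final String stagedName;
    final String localPath;

    Resolved(String stagedName, String localPath) {
      this.stagedName = stagedName;
      this.localPath = localPath;
    }
  }

  /** Assumed naming scheme: "<base>-<hash><ext>", e.g. "lib-abc123.jar". */
  static String createStagingFileName(File file, String hash) {
    String name = file.getName();
    int dot = name.lastIndexOf('.');
    String base = dot > 0 ? name.substring(0, dot) : name;
    String ext = dot > 0 ? name.substring(dot) : "";
    return base + "-" + hash + ext;
  }

  /** Splits "stagedName=path" if '=' is present; otherwise generates the name. */
  static Resolved resolve(String spec, String hash) {
    String stagedName = null;
    String path = spec;
    int eq = spec.indexOf('=');
    if (eq > 0) {
      // Explicit staged name supplied by the caller.
      stagedName = spec.substring(0, eq);
      path = spec.substring(eq + 1);
    }
    if (stagedName == null) {
      // No '=' in the entry: derive the name from the file name and hash.
      stagedName = createStagingFileName(new File(path), hash);
    }
    return new Resolved(stagedName, path);
  }
}
```

Because the generated name depends only on the file name and the hash, staging the same file twice yields the same name, which is what name-based caching needs.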

##########
File path: 
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
##########
@@ -211,8 +210,8 @@
 
   @VisibleForTesting static final int GCS_UPLOAD_BUFFER_SIZE_BYTES_DEFAULT = 1024 * 1024;
 
-  @VisibleForTesting static final String PIPELINE_FILE_FORMAT = "pipeline-%s.pb";
-  @VisibleForTesting static final String DATAFLOW_GRAPH_FILE_FORMAT = "dataflow_graph-%s.json";
+  @VisibleForTesting static final String PIPELINE_FILE_NAME = "pipeline.pb";

Review comment:
       In the previous PR, `forBytesToStage` used the target name exactly as 
given, so we had to make the target name unique ourselves. Now `forBytesToStage` 
generates a unique name on its own, so we no longer need to append a UUID suffix 
manually.
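
To illustrate why a fixed target name such as `pipeline.pb` is enough, here is a rough sketch that derives a unique staged name from the bytes themselves. It assumes the name is built from a SHA-256 content hash, in line with the "filename + hash" idea in the issue title. `ContentHashNameSketch` and `uniqueName` are hypothetical names, and the real `forBytesToStage` may use a different hash or format.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch; not the actual forBytesToStage implementation.
final class ContentHashNameSketch {

  /** Returns the lowercase hex SHA-256 digest of the given bytes. */
  static String sha256Hex(byte[] bytes) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest(bytes)) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  /** Assumed scheme: insert the content hash before the file extension. */
  static String uniqueName(String targetName, byte[] bytes) {
    int dot = targetName.lastIndexOf('.');
    String base = dot > 0 ? targetName.substring(0, dot) : targetName;
    String ext = dot > 0 ? targetName.substring(dot) : "";
    return base + "-" + sha256Hex(bytes) + ext;
  }
}
```

Under this assumption, identical bytes always map to the same staged name, while different pipelines get different names, so no random suffix is needed to avoid collisions.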




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 437496)
    Time Spent: 1h  (was: 50m)

> using filename + hash instead of UUID for staging name
> ------------------------------------------------------
>
>                 Key: BEAM-10077
>                 URL: https://issues.apache.org/jira/browse/BEAM-10077
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Heejong Lee
>            Assignee: Heejong Lee
>            Priority: P2
>             Fix For: 2.22.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> A recent change, BEAM-9383, disabled the GCS artifact caching logic that is 
> keyed on object names. Changing staging name generation from UUID to 
> filename + hash will re-enable artifact caching, so we can avoid re-uploading 
> the same artifact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
