[https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017453#comment-17017453]
Boyuan Zhang commented on BEAM-9078:
------------------------------------
Hey, 2.19.0 release branch has been cut. Any updates on this issue?
> Large Tarball Artifacts Should Use GCS Resumable Upload
> -------------------------------------------------------
>
> Key: BEAM-9078
> URL: https://issues.apache.org/jira/browse/BEAM-9078
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.17.0
> Reporter: Brad West
> Assignee: Brad West
> Priority: Major
> Fix For: 2.19.0
>
> Original Estimate: 1h
> Time Spent: 40m
> Remaining Estimate: 20m
>
> It's possible for the tarball uploaded to GCS to be quite large, for example
> when a user vendors multiple dependencies in the tarball to produce a more
> stable deployable artifact.
> Before this change the GCS upload API call executed a multipart upload, which
> Google
> [documentation](https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload)
> states should be used only when the file is small enough to upload again if
> the connection fails. For large tarballs, we hit 60-second socket timeouts
> before the multipart upload completes. By passing `total_size`, apitools
> first checks whether the size exceeds the resumable upload threshold, and if
> so executes the more robust resumable upload instead of a multipart one,
> avoiding the socket timeouts.
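The strategy selection the description refers to can be sketched roughly as
follows. This is an illustrative Python sketch, not apitools' actual code: the
5 MiB threshold and the function name are assumptions; the real decision lives
inside apitools' `transfer.Upload`, which Beam invokes when staging artifacts.

```python
# Assumed threshold for illustration; apitools defines its own constant.
RESUMABLE_UPLOAD_THRESHOLD = 5 * 1024 * 1024  # bytes


def choose_upload_strategy(total_size=None):
    """Pick an upload strategy the way the issue describes apitools doing it.

    Multipart sends the whole payload in a single request, so a dropped
    connection means re-uploading everything from scratch. A resumable
    upload sends chunks and can continue after a transient failure, which
    is why large tarballs should use it.
    """
    if total_size is not None and total_size > RESUMABLE_UPLOAD_THRESHOLD:
        return 'resumable'
    # When total_size is omitted (the pre-fix behavior) or small, a
    # multipart upload is used.
    return 'multipart'
```

Under this model, omitting `total_size` always yields a multipart upload,
which is exactly the pre-fix behavior that caused socket timeouts on large
tarballs; passing the real size lets the large case switch to resumable.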
--
This message was sent by Atlassian Jira
(v8.3.4#803005)