[
https://issues.apache.org/jira/browse/FLINK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
qyw updated FLINK-35150:
------------------------
Attachment: (was: image-2024-04-18-11-20-43-126.png)
> The specified upload does not exist. The upload ID may be invalid
> -----------------------------------------------------------------
>
> Key: FLINK-35150
> URL: https://issues.apache.org/jira/browse/FLINK-35150
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem
> Affects Versions: 1.15.0
> Reporter: qyw
> Priority: Major
> Attachments: image-2024-04-18-10-51-05-071.png,
> image-2024-04-18-11-03-08-998.png, image-2024-04-18-11-20-25-583.png
>
>
> I am using Flink with the S3 Hadoop filesystem to write CSV files to S3,
> with the patch from FLINK-28513 applied. I don't understand why the
> "sync" method of S3RecoverableFsDataOutputStream performs a
> "completeMultipartUpload" operation. If the upload is completed there,
> the parts belonging to that uploadID have already been merged, so a
> later call to close, which uploads the rest of the stream, will
> inevitably fail.
>
> Therefore, when the message in the CSV is larger than
> "S3_MULTIPART_MIN_PART_SIZE", an uploadPart is started when switching
> files. When BulkPartWriter then performs closeForCommit, the "sync"
> method of S3RecoverableFsDataOutputStream has already called
> completeMultipartUpload, so the uploadPart issued by the
> "closeForCommit" method of S3RecoverableFsDataOutputStream fails with
> the error in the title.
>
> BulkPartWriter:
> !image-2024-04-18-11-03-08-998.png!
> CsvBulkWriter:
> !image-2024-04-18-11-20-43-126.png!
> S3RecoverableFsDataOutputStream:
> !image-2024-04-18-10-51-05-071.png!
>
>
>
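The call order described above can be replayed with a small in-memory model. This is only a sketch of the reported sequence, not Flink or AWS SDK code: FakeS3, MultipartLifecycleSketch and replayReportedSequence are hypothetical names, and the model assumes only that an upload ID stops being usable once completeMultipartUpload has been called on it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultipartLifecycleSketch {

    /** Hypothetical in-memory stand-in for a bucket; not the AWS SDK. */
    static class FakeS3 {
        private static final String NO_SUCH_UPLOAD =
                "The specified upload does not exist. The upload ID may be invalid";

        // Upload IDs that are still open, with the parts received so far.
        private final Map<String, List<byte[]>> openUploads = new HashMap<>();
        private int nextId = 0;

        String initiateMultipartUpload() {
            String uploadId = "upload-" + nextId++;
            openUploads.put(uploadId, new ArrayList<>());
            return uploadId;
        }

        void uploadPart(String uploadId, byte[] part) {
            List<byte[]> parts = openUploads.get(uploadId);
            if (parts == null) {
                throw new IllegalStateException(NO_SUCH_UPLOAD);
            }
            parts.add(part);
        }

        /** Merges the parts and retires the upload ID; returns the part count. */
        int completeMultipartUpload(String uploadId) {
            List<byte[]> parts = openUploads.remove(uploadId);
            if (parts == null) {
                throw new IllegalStateException(NO_SUCH_UPLOAD);
            }
            return parts.size();
        }
    }

    /** Replays the reported order of calls; returns the error message, or null if none. */
    static String replayReportedSequence() {
        FakeS3 s3 = new FakeS3();
        String uploadId = s3.initiateMultipartUpload();
        s3.uploadPart(uploadId, new byte[] {1});  // part uploaded while writing
        s3.completeMultipartUpload(uploadId);     // what the report says sync() does
        try {
            s3.uploadPart(uploadId, new byte[] {2});  // what closeForCommit then attempts
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replayReportedSequence());
    }
}
```

In this model the second uploadPart fails because the upload ID was retired by the earlier completeMultipartUpload, which matches the failure described in the report.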
--
This message was sent by Atlassian Jira
(v8.20.10#820010)