steveloughran commented on code in PR #15210:
URL: https://github.com/apache/iceberg/pull/15210#discussion_r2822189298
##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3OutputStream.java:
##########
@@ -407,6 +407,9 @@ private void cleanUpStagingFiles() {
.suppressFailureWhenFinished()
.onFailure((file, thrown) -> LOG.warn("Failed to delete staging file: {}", file, thrown))
.run(File::delete);
+ // clear staging files and multipart map
+ stagingFiles.clear();
+ multiPartMap.clear();
Review Comment:
You're limited to 10,000 parts per upload, so even on long-lived uploads I don't see that much memory being consumed by a single stream.
Having many long-lived streams is a problem, but that's a scale problem in general.
You may be encountering the problem that the ContentStreamProviders in the AWS SDK like to buffer contents into memory - though fromFile() is actually the good one here, at least in the latest SDKs.
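To put a number on the 10k-part bound: a minimal, self-contained sketch of the per-stream part-tracking map's worst-case footprint. The map name and the ~100-bytes-per-entry figure are illustrative assumptions, not Iceberg's actual accounting:

```java
import java.util.HashMap;
import java.util.Map;

public class PartMapFootprint {
  public static void main(String[] args) {
    // S3 multipart uploads allow at most 10,000 parts per upload,
    // so a per-stream partNumber -> ETag map is bounded at 10k entries.
    int maxParts = 10_000;
    Map<Integer, String> multiPartMap = new HashMap<>();
    for (int part = 1; part <= maxParts; part++) {
      multiPartMap.put(part, "etag-" + part); // hypothetical ETag strings
    }
    // Rough upper bound: ~100 bytes per entry (HashMap entry, boxed key,
    // short ETag string) -> about 1 MB for a fully populated map.
    long approxBytes = (long) multiPartMap.size() * 100;
    System.out.println(
        "entries=" + multiPartMap.size() + " approxKiB=" + (approxBytes / 1024));
  }
}
```

Even at the hard 10,000-part ceiling this is on the order of a megabyte per stream, which is why a single stream's map is not the memory concern; many concurrent long-lived streams would be.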
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]