[
https://issues.apache.org/jira/browse/HADOOP-17847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536329#comment-17536329
]
Luca M commented on HADOOP-17847:
---------------------------------
Hi,
I am seeing a similar issue in our project and am not sure whether to create a
separate JIRA or comment on this one.
The S3 connections seem to complete successfully, but the statistics are
logging warnings about pending data. We are using hadoop=3.3.2,
aws-java-sdk-bundle=1.11.931.
Here is the log snippet with the relevant information:
{code}
[09-May-2022 22:13:05.013 UTC] DEBUG org.apache.hadoop.fs.s3a.S3AFileSystem
[] - PUT completed success=true; 48819 bytes
*****************
[09-May-2022 22:13:05.013 UTC] DEBUG org.apache.hadoop.fs.s3a.S3AFileSystem
[] - Finished write to PATH_WITH_FILENAME, len 48819. etag
**, version **
***************
[09-May-2022 22:13:05.012 UTC] DEBUG
org.apache.hadoop.fs.s3a.S3ABlockOutputStream [] - Upload
complete to PATH_WITH_FILENAME by WriteOperationHelper \{bucket=BUCKETNAME}
*****************
[09-May-2022 22:13:05.014 UTC] WARN org.apache.hadoop.fs.s3a.S3AInstrumentation
[] - Closing output stream statistics while data is still
marked as pending upload in OutputStreamStatistics
******
[09-May-2022 22:13:05.014 UTC] WARN org.apache.hadoop.fs.s3a.S3AInstrumentation
[] - Closing output stream statistics while data is still
marked as pending upload in
OutputStreamStatistics{counters=((stream_write_queue_duration=0)
(action_executor_acquired.failures=0) (op_abort.failures=0)
(stream_write_bytes=48819) (op_abort=0) (action_executor_acquired=0)
(multipart_upload_completed.failures=0) (object_multipart_aborted.failures=0)
(op_hsync=0) (op_hflush=0) (stream_write_exceptions_completing_upload=0)
(object_multipart_aborted=0) (stream_write_total_data=40960)
(stream_write_block_uploads=1) (stream_write_exceptions=0)
(stream_write_total_time=0) (multipart_upload_completed=0));
gauges=((stream_write_block_uploads_data_pending=7859)
(stream_write_block_uploads_pending=1));
minimums=((action_executor_acquired.min=-1) (multipart_upload_completed.min=-1)
(object_multipart_aborted.min=-1) (op_abort.failures.min=-1)
(multipart_upload_completed.failures.min=-1)
(action_executor_acquired.failures.min=-1)
(object_multipart_aborted.failures.min=-1) (op_abort.min=-1));
maximums=((object_multipart_aborted.max=-1)
(multipart_upload_completed.failures.max=-1) (action_executor_acquired.max=-1)
(op_abort.max=-1) (multipart_upload_completed.max=-1)
(op_abort.failures.max=-1) (action_executor_acquired.failures.max=-1)
(object_multipart_aborted.failures.max=-1));
means=((action_executor_acquired.mean=(samples=0, sum=0, mean=0.0000))
(object_multipart_aborted.mean=(samples=0, sum=0, mean=0.0000))
(op_abort.failures.mean=(samples=0, sum=0, mean=0.0000))
(multipart_upload_completed.mean=(samples=0, sum=0, mean=0.0000))
(object_multipart_aborted.failures.mean=(samples=0, sum=0, mean=0.0000))
(multipart_upload_completed.failures.mean=(samples=0, sum=0, mean=0.0000))
(action_executor_acquired.failures.mean=(samples=0, sum=0, mean=0.0000))
(op_abort.mean=(samples=0, sum=0, mean=0.0000)));
{code}
Note that stream_write_total_data (40960) + stream_write_block_uploads_data_pending
(7859) = stream_write_bytes (48819). However, the earlier log lines seem to
indicate that the whole payload was uploaded correctly.
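The relationship between the three values in the log above can be checked mechanically. The following is a minimal sketch, not part of Hadoop: the class `PendingUploadCheck` and its methods are hypothetical names, and it assumes the statistics are available as the `(name=value)` text that appears in the WARN line. It only parses that text; it does not call any S3A API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses "(name=value)" pairs out of a logged
// OutputStreamStatistics string and compares the byte counters.
final class PendingUploadCheck {

    // Matches entries such as "(stream_write_bytes=48819)".
    // Nested entries like "(samples=0, sum=0, mean=0.0000)" are skipped.
    private static final Pattern ENTRY =
        Pattern.compile("\\(([A-Za-z_][A-Za-z0-9_.]*)=(-?\\d+)\\)");

    static Map<String, Long> parseValues(String statistics) {
        Map<String, Long> values = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(statistics);
        while (m.find()) {
            values.put(m.group(1), Long.parseLong(m.group(2)));
        }
        return values;
    }

    // Bytes written minus (bytes counted as uploaded + bytes still pending).
    // Zero means the pending gauge accounts for the whole gap.
    static long unaccountedBytes(Map<String, Long> v) {
        return v.getOrDefault("stream_write_bytes", 0L)
            - v.getOrDefault("stream_write_total_data", 0L)
            - v.getOrDefault("stream_write_block_uploads_data_pending", 0L);
    }

    public static void main(String[] args) {
        // Excerpt of the values reported in the WARN line above.
        String excerpt =
            "counters=((stream_write_bytes=48819) (stream_write_total_data=40960));"
            + " gauges=((stream_write_block_uploads_data_pending=7859)"
            + " (stream_write_block_uploads_pending=1));";
        Map<String, Long> v = parseValues(excerpt);
        System.out.println("pending bytes: "
            + v.get("stream_write_block_uploads_data_pending")); // 7859
        System.out.println("unaccounted bytes: " + unaccountedBytes(v)); // 0
    }
}
```

A result of 0 unaccounted bytes matches the observation that the pending gauge exactly covers the difference between bytes written and bytes counted as uploaded, which suggests a bookkeeping gap in the statistics rather than missing data. That reading is an interpretation of the log, not something this sketch can confirm.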
Let me know if you would prefer that I create a separate JIRA or if you need more info.
Thanks!
> S3AInstrumentation Closing output stream statistics while data is still
> marked as pending upload in OutputStreamStatistics
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-17847
> URL: https://issues.apache.org/jira/browse/HADOOP-17847
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.2.1
> Environment: hadoop: 3.2.1
> spark: 3.0.2
> k8s server version: 1.18
> aws.java.sdk.bundle.version:1.11.1033
> Reporter: Li Rong
> Priority: Major
> Attachments: logs.txt
>
>
> When using Hadoop S3A file upload for Spark event logs, the logs were queued
> up and not uploaded before the process shut down:
> {code:java}
> // 21/08/13 12:22:39 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client
> has been closed (this is expected if the application is shutting down.)
> 21/08/13 12:22:39 WARN S3AInstrumentation: Closing output stream statistics
> while data is still marked as pending upload in
> OutputStreamStatistics{blocksSubmitted=1, blocksInQueue=1, blocksActive=0,
> blockUploadsCompleted=0, blockUploadsFailed=0, bytesPendingUpload=106716,
> bytesUploaded=0, blocksAllocated=1, blocksReleased=1,
> blocksActivelyAllocated=0, exceptionsInMultipartFinalize=0,
> transferDuration=0 ms, queueDuration=0 ms, averageQueueTime=0 ms,
> totalUploadDuration=0 ms, effectiveBandwidth=0.0 bytes/s}{code}
> For details, see the attached logs.