[ 
https://issues.apache.org/jira/browse/HADOOP-17847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536329#comment-17536329
 ] 

Luca M edited comment on HADOOP-17847 at 5/12/22 9:55 PM:
----------------------------------------------------------

Hi,
I am seeing a similar issue in our project and am not sure whether to create a 
separate JIRA or comment on this one. 
The S3 connections seem to complete successfully, but the statistics are 
logging warnings about pending data. We are using hadoop=3.3.2, 
aws-java-sdk-bundle=1.11.1026.

Here is the log snippet with the relevant information:
{code}
[09-May-2022 22:13:05.013 UTC] DEBUG org.apache.hadoop.fs.s3a.S3AFileSystem     
                  [] - PUT completed success=true; 48819 bytes
*****************
[09-May-2022 22:13:05.013 UTC] DEBUG org.apache.hadoop.fs.s3a.S3AFileSystem     
                  [] - Finished write to PATH_WITH_FILENAME, len 48819. etag 
**, version **
***************
[09-May-2022 22:13:05.012 UTC] DEBUG 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream                [] - Upload 
complete to PATH_WITH_FILENAME by WriteOperationHelper \{bucket=BUCKETNAME}
*****************
[09-May-2022 22:13:05.014 UTC] WARN org.apache.hadoop.fs.s3a.S3AInstrumentation  
                [] - Closing output stream statistics while data is still 
marked as pending upload in OutputStreamStatistics
******
[09-May-2022 22:13:05.014 UTC] WARN org.apache.hadoop.fs.s3a.S3AInstrumentation 
                 [] - Closing output stream statistics while data is still 
marked as pending upload in 
OutputStreamStatistics{counters=((stream_write_queue_duration=0) 
(action_executor_acquired.failures=0) (op_abort.failures=0) 
(stream_write_bytes=48819) (op_abort=0) (action_executor_acquired=0) 
(multipart_upload_completed.failures=0) (object_multipart_aborted.failures=0) 
(op_hsync=0) (op_hflush=0) (stream_write_exceptions_completing_upload=0) 
(object_multipart_aborted=0) (stream_write_total_data=40960) 
(stream_write_block_uploads=1) (stream_write_exceptions=0) 
(stream_write_total_time=0) (multipart_upload_completed=0));
gauges=((stream_write_block_uploads_data_pending=7859) 
(stream_write_block_uploads_pending=1));
minimums=((action_executor_acquired.min=-1) (multipart_upload_completed.min=-1) 
(object_multipart_aborted.min=-1) (op_abort.failures.min=-1) 
(multipart_upload_completed.failures.min=-1) 
(action_executor_acquired.failures.min=-1) 
(object_multipart_aborted.failures.min=-1) (op_abort.min=-1));
maximums=((object_multipart_aborted.max=-1) 
(multipart_upload_completed.failures.max=-1) (action_executor_acquired.max=-1) 
(op_abort.max=-1) (multipart_upload_completed.max=-1) 
(op_abort.failures.max=-1) (action_executor_acquired.failures.max=-1) 
(object_multipart_aborted.failures.max=-1));
means=((action_executor_acquired.mean=(samples=0, sum=0, mean=0.0000)) 
(object_multipart_aborted.mean=(samples=0, sum=0, mean=0.0000)) 
(op_abort.failures.mean=(samples=0, sum=0, mean=0.0000)) 
(multipart_upload_completed.mean=(samples=0, sum=0, mean=0.0000)) 
(object_multipart_aborted.failures.mean=(samples=0, sum=0, mean=0.0000)) 
(multipart_upload_completed.failures.mean=(samples=0, sum=0, mean=0.0000)) 
(action_executor_acquired.failures.mean=(samples=0, sum=0, mean=0.0000)) 
(op_abort.mean=(samples=0, sum=0, mean=0.0000)));

Note: stream_write_total_data (40960) + stream_write_block_uploads_data_pending 
(7859) = stream_write_bytes (48819). However, the earlier log lines seem to 
indicate that the whole payload was uploaded correctly.
{code}

I downloaded the file and verified that the size was indeed 48819 bytes.
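
In case it helps anyone repeat the check on their own logs, here is a minimal sketch of the arithmetic noted in the log snippet above. The helper name `parse_int_stats` is hypothetical (it is not part of Hadoop), and the sample string is only an excerpt of the counters and gauges from the warning; the sketch assumes the dump keeps the `(name=value)` layout shown there.

```python
import re

# Matches integer-valued "(name=123)" entries as printed in the warning.
# Nested entries such as "(x.mean=(samples=0, sum=0, mean=0.0000))" do not
# match, because their value is not a plain integer.
_ENTRY = re.compile(r"\(([A-Za-z_.]+)=(-?\d+)\)")


def parse_int_stats(dump: str) -> dict:
    """Collect every integer-valued (name=value) pair from a statistics dump."""
    return {name: int(value) for name, value in _ENTRY.findall(dump)}


# Excerpt of the OutputStreamStatistics dump from the log above.
sample = (
    "counters=((stream_write_bytes=48819) (stream_write_total_data=40960) "
    "(stream_write_block_uploads=1)); "
    "gauges=((stream_write_block_uploads_data_pending=7859) "
    "(stream_write_block_uploads_pending=1));"
)

stats = parse_int_stats(sample)
accounted = (
    stats["stream_write_total_data"]
    + stats["stream_write_block_uploads_data_pending"]
)
# Compare the bytes written with the bytes counted as uploaded plus pending.
print(accounted == stats["stream_write_bytes"])
```

If the comparison holds while the object in S3 already has the full length, that would suggest the gauges were simply not updated before the statistics were closed, though this is only my reading of the numbers.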

Let me know if you would prefer that I create a separate JIRA, or if you need more info.

Thanks!


> S3AInstrumentation Closing output stream statistics while data is still 
> marked as pending upload in OutputStreamStatistics
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17847
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.1
>         Environment: hadoop: 3.2.1
> spark: 3.0.2
> k8s server version: 1.18
> aws.java.sdk.bundle.version:1.11.1033
>            Reporter: Li Rong
>            Priority: Major
>         Attachments: logs.txt
>
>
> When using the Hadoop S3A file upload for Spark event logs, the logs were 
> queued up and not uploaded before the process was shut down:
> {code:java}
> // 21/08/13 12:22:39 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
> has been closed (this is expected if the application is shutting down.)
> 21/08/13 12:22:39 WARN S3AInstrumentation: Closing output stream statistics 
> while data is still marked as pending upload in 
> OutputStreamStatistics{blocksSubmitted=1, blocksInQueue=1, blocksActive=0, 
> blockUploadsCompleted=0, blockUploadsFailed=0, bytesPendingUpload=106716, 
> bytesUploaded=0, blocksAllocated=1, blocksReleased=1, 
> blocksActivelyAllocated=0, exceptionsInMultipartFinalize=0, 
> transferDuration=0 ms, queueDuration=0 ms, averageQueueTime=0 ms, 
> totalUploadDuration=0 ms, effectiveBandwidth=0.0 bytes/s}{code}
> For details, see the attached logs.


