[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254401#comment-15254401
 ] 

Steve Loughran commented on HADOOP-13028:
-----------------------------------------

Colin, I'm about to push up my patch.

# Nobody had told me of HDFS-10175, never mind
# I'm using the classic MetricsRegistry, with all the instrumentation lifted 
from Azure; I made the text/keys more generic, so the counters could be used 
for other object stores
# I added a metrics-to-string builder, so the S3AFileSystem.toString() operation 
can just do a complete dump of the stats. This is handy as it lets me print out 
the statistics of a run even with code built against older Hadoop versions.
# Note that in the object stores, it's not so much "per FS method" we're 
counting, but "per object store API method". E.g. we're counting the number of 
copy operations in a rename, the number of bytes copied remotely, the deletes 
that take place there, etc.
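
To illustrate that last point, here is a minimal sketch of "per object store API method" counting. It is not the code in the patch: ObjectStoreCounters and RenameSketch are hypothetical names, and the rename is only simulated as a copy plus a delete per object. The counter keys are the ones that appear in the stats dump further down.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical counter set; not the MetricsRegistry-based class in the patch. */
class ObjectStoreCounters {
    private final Map<String, AtomicLong> counters = new LinkedHashMap<>();

    ObjectStoreCounters(String... names) {
        for (String name : names) {
            counters.put(name, new AtomicLong());
        }
    }

    /** Add delta to a named counter; unknown names are rejected. */
    void increment(String name, long delta) {
        AtomicLong counter = counters.get(name);
        if (counter == null) {
            throw new IllegalArgumentException("Unknown counter: " + name);
        }
        counter.addAndGet(delta);
    }

    long get(String name) {
        AtomicLong counter = counters.get(name);
        if (counter == null) {
            throw new IllegalArgumentException("Unknown counter: " + name);
        }
        return counter.get();
    }
}

/** Hypothetical stand-in for a directory rename against an object store. */
class RenameSketch {
    /**
     * Simulates renaming a directory: each source object is copied
     * remotely and then deleted. One FS-level rename() therefore shows
     * up as several store-level operations in the counters.
     *
     * @param objectSizes size in bytes of each object under the source path
     */
    static void renameDirectory(long[] objectSizes, ObjectStoreCounters stats) {
        for (long size : objectSizes) {
            stats.increment("files_copied", 1);
            stats.increment("files_copied_bytes", size);
            stats.increment("files_deleted", 1);
        }
    }
}
```

A single rename of a two-object directory would bump files_copied and files_deleted by two each, which is the kind of detail a "per FS method" counter would hide.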

Because this code is the usual metrics stuff, it slots quite nicely into what 
is already there. It does add one class to Hadoop common, MetricStringBuilder, 
which I've put there because it is generically useful. 

{code}
S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, 
partSize=104857600, enableMultiObjectsDelete=true, 
multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null', statistics 
{3843 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops}, 
metrics {{Context=S3AFileSystem} 
{FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=7} 
{streamCloseOperations=6} {streamClosed=1} {streamAborted=5} 
{streamSeekOperations=5} {readExceptions=0} {forwardSeekOperations=3} 
{backwardSeekOperations=2} {bytesSkippedOnSeek=767} {files_created=0} 
{files_copied=0} {files_copied_bytes=0} {files_deleted=0} 
{directories_created=0} {directories_deleted=0} {ignored_errors=0} }}
{code}
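
For anyone wondering how the "metrics" part of that string is put together, the following is a rough sketch of the idea only. It is not the MetricStringBuilder in the patch: MetricDumpSketch and its dump() method are made-up names, and the prefix/separator/suffix parameters are an assumption chosen to reproduce the {name=value} layout above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; illustrates the dump format, not the patch's class. */
class MetricDumpSketch {
    /**
     * Renders each metric as prefix + name + separator + value + suffix,
     * in the iteration order of the supplied map.
     */
    static String dump(String prefix, String separator, String suffix,
                       Map<String, ?> metrics) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, ?> entry : metrics.entrySet()) {
            sb.append(prefix)
              .append(entry.getKey())
              .append(separator)
              .append(entry.getValue())
              .append(suffix);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the dump is stable.
        Map<String, Long> metrics = new LinkedHashMap<>();
        metrics.put("streamOpened", 7L);
        metrics.put("streamClosed", 1L);
        System.out.println("metrics {" + dump("{", "=", "} ", metrics) + "}");
    }
}
```

Keeping the formatting in a small standalone helper is what lets toString() do the complete dump without the caller needing any of the newer metrics APIs.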

> add counter and timer metrics for S3A HTTP & low-level operations
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>
> Against S3 (and other object stores), opening connections can be expensive, 
> and closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the number of 
> open/close/failure+reconnect operations, and timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)