[
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267400#comment-15267400
]
Chris Nauroth commented on HADOOP-13028:
----------------------------------------
Hello [[email protected]]. I'm still digging into the changes in the seek
code and the tests, but I'd like to share the feedback I have so far for patch
v008.
# Let's put visibility annotations on {{MetricsRecordBuilder}}.
Public/Evolving for agreement with the {{MetricsRecordBuilder}} base class?
# Should {{initMultipartUploads}} increment the ignored error counter?
# {{rename}} has several {{catch}} blocks that don't propagate an exception.
Should these increment the ignored error counter?
# In the following exception message, I think we need an extra space before the
"to". There are 2 different call sites that produce this message, so 2 spots
to fix.
{code}
throw new InterruptedIOException("Interrupted copying " + src
+ "to " + dst + ", cancelling");
{code}
# Can you please include {{src}} and {{dst}} in this log message from
{{rename}}?
{code}
LOG.debug("rename: src or dst are empty");
{code}
# Can you please include {{key}} in this log message from {{delete}}?
{code}
LOG.debug("Deleting fake empty directory");
{code}
# Should {{S3AFileSystem#toString}} also include {{maxKeys}}, {{cannedACL}} and
{{readAhead}}?
# In {{S3AInputStream#reopen}}, is the following log message redundant,
considering the call to {{closeStream}} will do its own logging?
{code}
LOG.debug("Closing the previous stream");
closeStream("reopen(" + reason + ")", requestedStreamLen);
{code}
# {{S3AInputStream#setReadahead}} doesn't exactly match the specification
defined in {{CanSetReadahead}}. The interface says that {{null}} means to use
the default, but the implementation here rejects {{null}}. This could be
problematic for more complex use cases, such as someone wanting to
programmatically control the amount of readahead. If they called
{{setReadahead}} with a custom value, then I think ideally we should allow them
to call it with {{null}} later, and restore back to the default from
configuration. (I admit this is an edge case, but a {{DFSInputStream}} does
allow this behavior.)
# {{S3AInstrumentation}} receives a {{Configuration}} in its constructor but
doesn't use it. Can it be removed?
# {{S3AInstrumentation#gauge}} appears to be unused.
# {{InputStreamStatistics#toString}} does not include {{readFullyOperations}}.
It looks like there are some CheckStyle and JavaDoc things to follow up on from
that last pre-commit run. The test failure is unrelated.
> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, metrics
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch,
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch,
> HADOOP-13028-007.patch, HADOOP-13028-008.patch,
> HADOOP-13028-branch-2-008.patch,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive,
> closing connections may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of
> open/close/failure+reconnect operations, timers of how long things take. This
> can be used downstream to measure efficiency of the code (how often
> connections are being made), connection reliability, etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]