[
https://issues.apache.org/jira/browse/HADOOP-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712300#comment-17712300
]
ASF GitHub Bot commented on HADOOP-18695:
-----------------------------------------
dannycjones commented on code in PR #5548:
URL: https://github.com/apache/hadoop/pull/5548#discussion_r1166484677
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -2590,6 +2591,25 @@ protected DurationTrackerFactory
getDurationTrackerFactory() {
: null;
}
+ /**
+ * Given a possibly null duration tracker factory, return a non-null
+ * one for use in tracking durations.
+ *
+ * @param factory factory.
+ * @return a non-null factory.
+ */
+ protected DurationTrackerFactory nonNullDurationTrackerFactory(
+ DurationTrackerFactory factory) {
+ DurationTrackerFactory f = factory;
+ if (f == null) {
+ f = getDurationTrackerFactory();
+ if (f == null) {
+ f = emptyStatisticsStore();
+ }
+ }
+ return f;
+ }
Review Comment:
Why do we add this non-null method if the helper we use to create duration
trackers already handles a null factory anyway?
https://github.com/apache/hadoop/blob/f6fa5bd1aa085a4d22f3450b545bb70063da9f51/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/impl/IOStatisticsBinding.java#L665-L678
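For reference, the null handling in the linked helper looks roughly like this;
the names (`createTracker`, `stubDurationTracker`) are recalled from
`IOStatisticsBinding` and may not match the linked lines exactly:
```java
// sketch: a null factory degrades to a stub tracker,
// so callers never need a non-null factory themselves.
private static DurationTracker createTracker(
    @Nullable DurationTrackerFactory factory,
    String statistic) {
  return factory != null
      ? factory.trackDuration(statistic)
      : stubDurationTracker();
}
```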
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AMiscOperations.java:
##########
@@ -114,7 +114,7 @@ public void testPutObjectDirect() throws Throwable {
new ByteArrayInputStream("PUT".getBytes()),
metadata);
LambdaTestUtils.intercept(IllegalStateException.class,
- () -> fs.putObjectDirect(put, PutObjectOptions.keepingDirs()));
+ () -> fs.putObjectDirect(put, PutObjectOptions.keepingDirs(), null));
Review Comment:
Why not grab the duration tracker from the FS here?
```suggestion
() -> fs.putObjectDirect(put, PutObjectOptions.keepingDirs(),
    fs.getDurationTrackerFactory()));
```
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AWSRequestAnalyzer.java:
##########
@@ -216,6 +216,18 @@ private RequestInfo writing(final String verb,
|| request instanceof CompleteMultipartUploadRequest
|| request instanceof GetBucketLocationRequest;
}
+ /**
+ * Predicate which returns true if the request is part of the
+ * multipart upload API, and which therefore must be rejected
+ * if multipart upload is disabled.
+ * @param request request
+ * @return true if the request is part of the multipart upload API.
+ */
+ public static boolean isRequestMultipartIO(final Object request) {
+ return request instanceof CopyPartRequest
+ || request instanceof CompleteMultipartUploadRequest
+ || request instanceof InitiateMultipartUploadRequest;
+ }
Review Comment:
I think we're missing `UploadPartRequest` for any new parts being uploaded.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/UploadPartRequest.html
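Concretely, the predicate would then cover all four request types; a sketch
only, not the committed code:
```java
public static boolean isRequestMultipartIO(final Object request) {
  return request instanceof UploadPartRequest
      || request instanceof CopyPartRequest
      || request instanceof CompleteMultipartUploadRequest
      || request instanceof InitiateMultipartUploadRequest;
}
```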
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/AbstractSTestS3AHugeFiles.java:
##########
@@ -192,13 +201,7 @@ public void test_010_CreateHugeFile() throws IOException {
true,
uploadBlockSize,
progress)) {
- try {
- streamStatistics = getOutputStreamStatistics(out);
- } catch (ClassCastException e) {
- LOG.info("Wrapped output stream is not block stream: {}",
- out.getWrappedStream());
- streamStatistics = null;
- }
+ streamStatistics = getOutputStreamStatistics(out);
Review Comment:
Why do we drop this? If the catch wasn't right in the first place, shouldn't
we fail when we hit it?
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestAuditIntegration.java:
##########
@@ -67,6 +71,22 @@ public void testExceptionTranslation() throws Throwable {
});
}
+ /**
+ * UnsupportedRequest mapping and fail fast outcome.
+ */
+ @Test
+ public void testUnsupportedExceptionTranslation() throws Throwable {
+ final UnsupportedRequestException ex =
+     intercept(UnsupportedRequestException.class, () -> {
+       throw translateException("test", "/",
+           new AuditOperationRejectedException("not supported"));
+     });
+ final S3ARetryPolicy retryPolicy = new S3ARetryPolicy(new Configuration(false));
+ final RetryPolicy.RetryAction action = retryPolicy.shouldRetry(ex, 1, 1, true);
Review Comment:
This should return a "fail" retry action with 0 for "retries" and
"failovers", no?
```suggestion
final RetryPolicy.RetryAction action = retryPolicy.shouldRetry(ex, 0, 0, true);
```
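A follow-up assertion could then pin the expected outcome down explicitly; a
sketch assuming AssertJ, which the hadoop-aws test suites already use:
```java
// the rejected operation must map to a fail-fast decision:
// no retries, no failovers.
Assertions.assertThat(action.action)
    .describedAs("retry decision for %s", ex)
    .isEqualTo(RetryPolicy.RetryAction.RetryDecision.FAIL);
```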
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/AbstractSTestS3AHugeFiles.java:
##########
@@ -119,6 +119,14 @@ protected Configuration createScaleConfiguration() {
DEFAULT_HUGE_PARTITION_SIZE);
assertTrue("Partition size too small: " + partitionSize,
partitionSize >= MULTIPART_MIN_SIZE);
+ removeBaseAndBucketOverrides(conf,
+ SOCKET_SEND_BUFFER,
+ SOCKET_RECV_BUFFER,
+ MIN_MULTIPART_THRESHOLD,
+ MULTIPART_SIZE,
+ USER_AGENT_PREFIX,
+ FAST_UPLOAD_BUFFER);
Review Comment:
Random comment on the maintainability of the S3A tests: would a wrapper for
`Configuration` make more sense, where the wrapper removes base/bucket
overrides for any option you set?
The wrapper would just intercept `Configuration::set(String name, String
value, String source)`, with a body like:
```java
/**
 * Wrapper for Configuration which drops any base/bucket overrides
 * when setting a configuration option.
 */
public class TestConfiguration extends Configuration {

  private final Configuration innerConfiguration;

  public TestConfiguration(Configuration conf) {
    this.innerConfiguration = conf;
  }

  @Override
  public void set(String name, String value, String source) {
    // drop any base/bucket overrides for this key first
    S3ATestUtils.removeBaseAndBucketOverrides(innerConfiguration, name);
    this.innerConfiguration.set(name, value, source);
  }
}
```
It would be a mess to update all the existing tests, of course.
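Usage in a test setup would then be roughly this (hypothetical, since
`TestConfiguration` doesn't exist yet):
```java
// hypothetical wrapper: later set() calls implicitly clear
// base/bucket overrides for the key being set.
Configuration conf = new TestConfiguration(createScaleConfiguration());
conf.set(MULTIPART_SIZE, String.valueOf(MULTIPART_MIN_SIZE));
```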
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/WriteOperationHelper.java:
##########
@@ -590,10 +593,9 @@ public void uploadObject(PutObjectRequest putObjectRequest,
PutObjectOptions putOptions)
throws IOException {
- retry("Writing Object",
- putObjectRequest.getKey(), true,
- withinAuditSpan(getAuditSpan(), () ->
- owner.putObjectDirect(putObjectRequest, putOptions)));
+ // the transfer manager is not involved; instead the object
+ // is PUT directly.
+ putObject(putObjectRequest, putOptions, null);
Review Comment:
Why the change? Is it a change?
The JavaDoc is a bit confusing: it suggests this method might use the transfer
manager, but we don't. Looking at `WriteOperations`, its JavaDoc says "PUT an
object directly (i.e. not via the transfer manager).", which conflicts with
this method's JavaDoc.
> S3A: reject multipart copy requests when disabled
> -------------------------------------------------
>
> Key: HADOOP-18695
> URL: https://issues.apache.org/jira/browse/HADOOP-18695
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Labels: pull-request-available
>
> follow-on to HADOOP-18637 and support for huge file uploads with stores which
> don't support MPU.
> * prevent use of API against any s3 store when disabled, using logging
> auditor to reject it
> * tests to verify rename of huge files still works (by setting large part
> size)